blackbird-op-linux - Blackbird™ Linux sources for OpenPOWER

	Commit message (Collapse)	Author	Age	Files	Lines
*	x86: EDAC: carve out AMD MCE decoding logic	Borislav Petkov	2009-10-02	3	-6/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This converts the MCE decoding logic into a standalone config option which can be built-in or a module, the first one being the default for MCEs happening early on in the boot process. This, beyond being separated in a cleaner way, also saves RAM by making the decoding logic modular. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091002133148.GD28682@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	initcalls: Add early_initcall() for modules	Borislav Petkov	2009-10-02	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	Complete the early_initcall() API by making it available in modules too. To be used by the EDAC/MCE code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091002132321.GC28682@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	x86: EDAC: MCE: Fix MCE decoding callback logic	Ingo Molnar	2009-10-02	4	-24/+53
	Make decoding of MCEs happen only on AMD hardware by registering a non-default callback only on CPU families which support it. While looking at the interaction of decode_mce() with the other MCE code I also noticed a few other things and made the following cleanups/fixes: - Fixed the mce_decode() weak alias - a weak alias is really not good here, it should be a proper callback. A weak alias will be overridden if a piece of code is built into the kernel - not good, obviously. - The patch initializes the callback on AMD family 10h and 11h. - Added the more correct fallback printk of: No support for human readable MCE decoding on this CPU type. Transcribe the message and run it through 'mcelog --ascii' to decode. On CPUs that don't have a decoder. - Made the surrounding code more readable. Note that the callback allows us to have a default fallback - without having to check the CPU versions during the printout itself. When an EDAC module registers itself, it can install the decode-print function. (there's no unregister needed as this is core code.) version -v2 by Borislav Petkov: - add K8 to the set of supported CPUs - always build in edac_mce_amd since we use an early_initcall now - fix checkpatch warnings Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091001141432.GA11410@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>
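The callback-with-fallback arrangement described in the message above can be sketched roughly as follows. This is a minimal userspace illustration, not the kernel code: `struct mce_record`, `register_mce_decoder()`, `report_mce()` and `amd_decode()` are hypothetical names, and the return convention (0 = not decoded, 1 = decoded) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a machine-check record. */
struct mce_record {
	unsigned long long status;
};

typedef int (*mce_decoder_fn)(const struct mce_record *m);

/* Default fallback: used while no decoder has been registered. */
static int default_decode(const struct mce_record *m)
{
	(void)m;
	puts("No support for human readable MCE decoding on this CPU type.");
	return 0;	/* not decoded */
}

/* A real function pointer instead of a weak alias; it starts at the fallback. */
static mce_decoder_fn mce_decoder = default_decode;

/* Called by a vendor-specific decoder on the CPU families it supports. */
static void register_mce_decoder(mce_decoder_fn fn)
{
	assert(fn != NULL);
	mce_decoder = fn;
}

/* The reporting path never checks the CPU type; it only calls the pointer. */
static int report_mce(const struct mce_record *m)
{
	return mce_decoder(m);
}

/* Hypothetical vendor decoder. */
static int amd_decode(const struct mce_record *m)
{
	printf("decoded MCE, status=%#llx\n", m->status);
	return 1;	/* decoded */
}
```

Because the pointer always holds a valid function, the printout path needs no CPU-version check, which is the property the message highlights.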
*	x86: Don't leak 64-bit kernel register values to 32-bit processes	Jan Beulich	2009-10-01	1	-13/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	While 32-bit processes can't directly access R8...R15, they can gain access to these registers by temporarily switching themselves into 64-bit mode. Therefore, registers not preserved anyway by called C functions (i.e. R8...R11) must be cleared prior to returning to user mode. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: <stable@kernel.org> LKML-Reference: <4AC34D73020000780001744A@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	x86, SLUB: Remove unused CONFIG FAST_CMPXCHG_LOCAL	Jaswinder Singh Rajput	2009-10-01	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove unused CONFIG FAST_CMPXCHG_LOCAL from Kconfig. Reported-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Matt Mackall <mpm@selenic.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Robert P. J. Day" <rpjday@crashcourse.ca> Cc: linux-mm@kvack.org LKML-Reference: <1253981501.4568.61.camel@ht.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	x86: earlyprintk: Fix regression to handle serial,ttySn as 1 arg	Jason Wessel	2009-10-01	2	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit c953094 ("early_printk: Allow more than one early console") introduced a regression in the parsing of the earlyprintk= kernel arguments. If you specify "earlyprintk=serial,ttyS0,115200" as a kernel argument, the "serial,ttyS" should be parsed as a single argument and not as "serial" and then "ttyS". Also update the documentation to reflect you can specify the ttyS directly without the "serial" argument. Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <gregkh@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> LKML-Reference: <4ABB7D5E.6000301@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
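A rough sketch of the parsing rule described above, in which "serial,ttyS0,115200" and "ttyS0,115200" are treated the same way. This is not the kernel's parser: `struct serial_opts` and `parse_serial()` are hypothetical, only the `ttySn[,baud]` forms are handled, and the 9600 default baud rate is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing an earlyprintk= serial argument. */
struct serial_opts {
	int port;	/* the n in ttySn */
	long baud;
};

/* Returns 0 on success, -1 if the argument is not a ttyS console. */
static int parse_serial(const char *arg, struct serial_opts *out)
{
	char *end;

	assert(arg != NULL && out != NULL);

	/* "serial,ttyS0" is one argument: skip the optional "serial," prefix. */
	if (strncmp(arg, "serial", 6) == 0) {
		arg += 6;
		if (*arg == ',')
			arg++;
	}

	if (strncmp(arg, "ttyS", 4) != 0)
		return -1;
	arg += 4;

	out->port = (int)strtol(arg, &end, 10);
	if (end == arg)
		return -1;	/* no port number after "ttyS" */

	out->baud = 9600;	/* assumed default when no rate is given */
	if (*end == ',')
		out->baud = strtol(end + 1, NULL, 10);
	return 0;
}
```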
*	x86: Don't generate cmpxchg8b_emu if CONFIG_X86_CMPXCHG64=y	Eric Dumazet	2009-10-01	2	-2/+6
	Conditionally compile cmpxchg8b_emu.o and EXPORT_SYMBOL(cmpxchg8b_emu). This reduces the kernel size a bit. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4AC43E7E.1000600@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	x86: Fix csum_ipv6_magic asm memory clobber	Samuel Thibault	2009-10-01	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Just like ip_fast_csum, the assembly snippet in csum_ipv6_magic needs a memory clobber, as it is only passed the address of the buffer, not a memory reference to the buffer itself. This caused failures in Hurd's pfinetv4 when we tried to compile it with gcc-4.3 (bogus checksums). Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	x86: Optimize cmpxchg64() at build-time some more	Linus Torvalds	2009-10-01	1	-1/+2
	Try to avoid the 'alternates()' code when we can statically determine that cmpxchg8b is fine. We already have that CONFIG_X86_CMPXCHG64 (enabled by PAE support), and we could easily also enable it for some of the CPU cases. Note, this patch only adds CMPXCHG8B for the obvious Intel CPUs, not for others. (There was something really messy about cmpxchg8b and clone CPUs, so if you enable it on other CPUs later, do it carefully.) If we avoid that asm-alternative thing when we can assume the instruction exists, we'll generate less support crud, and we'll avoid the whole issue with that extra 'nop' for padding instruction sizes etc. LKML-Reference: <alpine.LFD.2.01.0909301743150.6996@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
*	Merge branch 'sched-fixes-for-linus' of ↵	Linus Torvalds	2009-09-30	5	-16/+85
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched_clock: Fix atomicity/continuity bug by using cmpxchg64() x86: Provide an alternative() based cmpxchg64()
\| *	sched_clock: Fix atomicity/continuity bug by using cmpxchg64()	Eric Dumazet	2009-09-30	1	-2/+2
	Commit def0a9b2573 (sched_clock: Make it NMI safe) assumed cmpxchg() of 64bit values was available on X86_32. That is not so - and causes some subtle scheduler misbehavior due to incorrect timestamps that are off by up to ~4 seconds. Two symptoms are known right now: - interactivity problems seen by Arjan: up to 600 msecs latencies instead of the expected 20-40 msecs. These latencies are very visible on the desktop. - incorrect CPU stats: occasionally too high percentages in 'top', and crazy CPU usage stats. Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: John Stultz <johnstul@us.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090930170754.0886ff2e@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
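To illustrate why the whole 64-bit timestamp has to be swapped in one atomic step, here is a userspace analogy using C11 atomics. It is only a sketch and not the scheduler code: `clock_ns` and `clock_update()` are hypothetical names, and `atomic_compare_exchange_weak()` stands in for the kernel's cmpxchg64().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared 64-bit clock value in nanoseconds. */
static _Atomic uint64_t clock_ns;

/*
 * Advance the shared clock to at least `now` and never move it backwards.
 * The compare-and-exchange covers all 64 bits at once, so a concurrent
 * reader or writer can never observe a half-updated value.
 * Returns the clock value in effect after the call.
 */
static uint64_t clock_update(uint64_t now)
{
	uint64_t old = atomic_load(&clock_ns);

	while (old < now) {
		if (atomic_compare_exchange_weak(&clock_ns, &old, now))
			return now;
		/* On failure `old` holds the current value; re-check and retry. */
	}
	return old;
}
```

A value of 2^32 nanoseconds is roughly 4.3 seconds, which fits the error size the message reports when the upper and lower halves of a 64-bit timestamp are not updated together.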
\| *	x86: Provide an alternative() based cmpxchg64()	Arjan van de Ven	2009-09-30	4	-14/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cmpxchg64() today generates, to quote Linus, "barf bag" code. cmpxchg64() is about to get used in the scheduler to fix a bug there, but it's a prerequisite that cmpxchg64() first be made non-sucking. This patch turns cmpxchg64() into an efficient implementation that uses the alternative() mechanism to just use the raw instruction on all modern systems. Note: the fallback is NOT smp safe, just like the current fallback is not SMP safe. (Interested parties with i486 based SMP systems are welcome to submit fix patches for that.) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> [ fixed asm constraint bug ] Fixed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090930170754.0886ff2e@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* \|	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds	2009-09-30	30	-114/+1852
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: MIPS: Avoid spurious make includecheck message MIPS: VPE: Get rid of BKL. MIPS: VPE: Fix build after the credential changes a while ago. MIPS: Excite: Get rid of BKL. MIPS: Sibyte: Get rid of BKL. MIPS: BCM63xx: Add PCMCIA & Cardbus support. MIPS: MSP71xx: request_irq() failure ignored in msp_pcibios_config_access() MIPS: Decrease size of au1xxx_dbdma_pm_regs[][] MIPS: SMP: Inline arch_send_call_function_{single_ipi,ipi_mask} MIPS: SMP: Fix build. MIPS: MIPSxx SC: Avoid destructive invalidation on partial L2 cachelines. MIPS: Sibyte: Fix compilation error. MIPS: BCM1480: Re-apply patch lost due to bad resolution of merge conflict. MIPS: BCM63xx: Add serial driver for bcm63xx integrated UART. MIPS: Loongson2: Fix typo "enalbe" -> "enable" MIPS: SMTC: Remove duplicate structure field initialization MIPS: Remove duplicated #include MIPS: BCM63xx: Remove duplicated #include
\| * \|	MIPS: Avoid spurious make includecheck message	Ralf Baechle	2009-09-30	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	arch/mips/include/asm/unaligned.h: linux/unaligned/generic.h is included more than once. Entirely legitimate but just noise. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: VPE: Get rid of BKL.	Ralf Baechle	2009-09-30	2	-42/+50
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: VPE: Fix build after the credential changes a while ago.	Ralf Baechle	2009-09-30	1	-10/+23
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: Excite: Get rid of BKL.	Ralf Baechle	2009-09-30	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's not obvious what good it was supposed to do here anyway. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: Sibyte: Get rid of BKL.	Ralf Baechle	2009-09-30	1	-18/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: BCM63xx: Add PCMCIA & Cardbus support.	Maxime Bizon	2009-09-30	8	-1/+763
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Reviewed-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: MSP71xx: request_irq() failure ignored in msp_pcibios_config_access()	Roel Kluin	2009-09-30	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Produce an error if request_irq() fails. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: "Ithamar R. Adema" <ithamar.adema@team-embedded.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: Decrease size of au1xxx_dbdma_pm_regs[][]	Roel Kluin	2009-09-30	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are 16 individual channels (NUM_DBDMA_CHANS) to save/restore plus the global ddma block config (the +1). The last register in a channel can be skipped since it's read-only (at offset 0x18). Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: Manuel Lauss <manuel.lauss@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: SMP: Inline arch_send_call_function_{single_ipi,ipi_mask}	Ralf Baechle	2009-09-30	2	-15/+13
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: SMP: Fix build.	Ralf Baechle	2009-09-30	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 48a048fed82a8e5fdd8618574f6d3de1a0d67a50 Author: Rusty Russell <rusty@rustcorp.com.au> Date: Thu Sep 24 09:34:44 2009 -0600 apparently only passed the "looks good" level of QA ;-) Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: MIPSxx SC: Avoid destructive invalidation on partial L2 cachelines.	Kevin Cernekee	2009-09-30	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This extends commit a8ca8b64e3fdfec17679cba0ca5ce6e3ffed092d to cover MIPSxx-style board cache code. Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: Sibyte: Fix compilation error.	Mark Mason	2009-09-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Build error introduced by d4f587c67fc39e0030ddd718675e252e208da4d7. Signed-off-by: Mark Mason <mmason@upwardaccess.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: BCM1480: Re-apply patch lost due to bad resolution of merge conflict.	Ralf Baechle	2009-09-30	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch 14275ccdb1e4b487cca745aba994699c426a31ee and d5dedd4507d307eb3f35f21b6e16f336fdc0d82a are conflicting and the conflict was resolved badly in merge 92241940be501f798cb21db344bbb3d1ec3c4f1c resulting in the BCM1480 changes of 14275ccdb1e4b487cca745aba994699c426a31ee getting lost. Sort out the damage. Reported and initial patch by Mark Mason <mmason@upwardaccess.com>. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: BCM63xx: Add serial driver for bcm63xx integrated UART.	Maxime Bizon	2009-09-30	8	-1/+964
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: Loongson2: Fix typo "enalbe" -> "enable"	Uwe Kleine-König	2009-09-30	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Yanhua <yanh@lemote.com> Cc: Robert Richter <robert.richter@amd.com> Acked-by: Wu Zhangjin <wuzj@lemote.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: SMTC: Remove duplicate structure field initialization	Julia Lawall	2009-09-30	1	-3/+2
	The definition of the irq_ipi structure has two initializations of the flags field. This combines them. [Ralf: The issue was originally introduced by commit be4894196d79455f420dd7bb78be7dc73bec115c (linux-mips.org) rsp. 033890b084adfa367c544864451d7730552ce8bf (kernel.org). The original intention of the code was to initialize .flags with both flags ored together. The broken C code as actually implemented will be compiled by an equally broken gcc to use only the last initialization, that is IRQF_PERCPU which means this turned into an SMTC bug for 2.6.23 and newer.] The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ identifier I, s, fld; position p0,p; expression E; @@ struct I s =@p0 { ... .fld@p = E, ...}; @s@ identifier I, s, r.fld; position r.p0,p; expression E; @@ struct I s =@p0 { ... .fld@p = E, ...}; @script:python@ p0 << r.p0; fld << r.fld; ps << s.p; pr << r.p; @@ if int(ps[0].line)!=int(pr[0].line) or int(ps[0].column)!=int(pr[0].column): cocci.print_main(fld,p0) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
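The effect of initializing the same field twice can be reproduced in a few lines. This is a standalone sketch: `struct action`, `FLAG_DISABLED` and `FLAG_PERCPU` and their values are hypothetical stand-ins, not the kernel definitions. With C99 designated initializers a later initializer for the same member overrides the earlier one, so only the last flag survives.

```c
/* Hypothetical flag values for the illustration. */
#define FLAG_DISABLED	0x20u
#define FLAG_PERCPU	0x400u

struct action {
	unsigned int flags;
	const char *name;
};

/* Duplicate .flags: the second initializer silently replaces the first. */
static const struct action broken = {
	.flags = FLAG_DISABLED,
	.name  = "ipi",
	.flags = FLAG_PERCPU,
};

/* Combined form: both flags are ORed into a single initializer. */
static const struct action fixed = {
	.flags = FLAG_DISABLED | FLAG_PERCPU,
	.name  = "ipi",
};
```

Compilers can flag the duplicate (for example GCC's `-Woverride-init`, which is part of `-Wextra`), but it is not an error by default.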
\| * \|	MIPS: Remove duplicated #include	Huang Weiyi	2009-09-30	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove duplicated #include in arch/mips/kernel/smp.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \|	MIPS: BCM63xx: Remove duplicated #include	Huang Weiyi	2009-09-30	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove duplicated #include in arch/mips/bcm63xx/boards/board_bcm963xx.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* \| \|	Merge branch 'for-linus' of ↵	Linus Torvalds	2009-09-30	2	-0/+2
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: fix missing initialization of i_dir_start_lookup member nilfs2: fix missing zero-fill initialization of btree node cache
\| * \| \|	nilfs2: fix missing initialization of i_dir_start_lookup member	Ryusuke Konishi	2009-09-29	1	-0/+1
	The i_dir_start_lookup field in nilfs_inode_info objects should be cleared when the objects are allocated, but the initialization was missing in case of reading from disk. This adds the initialization. Since the variable just gives a start page on directory lookups, the bug was nonfatal until now. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
\| * \| \|	nilfs2: fix missing zero-fill initialization of btree node cache	Ryusuke Konishi	2009-09-29	1	-0/+1
	This will fix file system corruption which infrequently happens after mount. The problem was reported from users with the title "[NILFS users] Fail to mount NILFS." (Message-ID: <200908211918.34720.yuri@itinteg.net>), and so forth. I've also experienced the corruption multiple times on kernel 2.6.30 and 2.6.31. The problem turned out to be caused by a mismatch between mapping->nrpages of a btree node cache and the actual number of pages held on the cache; if mapping->nrpages becomes zero even though the cache still has pages, truncate_inode_pages() returns without doing anything. Usually this is harmless except that it may leak pages, but garbage collection occasionally sees a stale page remaining in the btree node cache of the DAT (i.e. the disk address translation file of nilfs), which induces the corruption. I identified a missing initialization in btree node caches as the root cause. This corrects the bug. I've tested this for kernel 2.6.30 and 2.6.31. Reported-by: Yuri Chislov <yuri@itinteg.net> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: stable <stable@kernel.org>
* \| \|	Merge branch 'for_linus' of ↵	Linus Torvalds	2009-09-30	20	-751/+1362
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Fix time encoding with extra epoch bits ext4: Add a stub for mpage_da_data in the trace header jbd2: Use tracepoints for history file ext4: Use tracepoints for mb_history trace file ext4, jbd2: Drop unneeded printks at mount and unmount time ext4: Handle nested ext4_journal_start/stop calls without a journal ext4: Make sure ext4_dirty_inode() updates the inode in no journal mode ext4: Avoid updating the inode table bh twice in no journal mode ext4: EXT4_IOC_MOVE_EXT: Check for different original and donor inodes first ext4: async direct IO for holes and fallocate support ext4: Use end_io callback to avoid direct I/O fallback to buffered I/O ext4: Split uninitialized extents for direct I/O ext4: release reserved quota when block reservation for delalloc retry ext4: Adjust ext4_da_writepages() to write out larger contiguous chunks ext4: Fix hueristic which avoids group preallocation for closed files ext4: Use ext4_msg() for ext4_da_writepage() errors ext4: Update documentation about quota mount options
\| * \| \|	ext4: Fix time encoding with extra epoch bits	Theodore Ts'o	2009-09-30	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"Looking at ext4.h, I think the setting of extra time fields forgets to mask the epoch bits so the epoch part overwrites nsec part. The second change is only for coherency (2 -> EXT4_EPOCH_BITS)." Thanks to Damien Guibouret for pointing out this problem. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
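The masking problem quoted above can be shown with a small standalone sketch. It assumes the extra 32-bit field keeps the epoch in its low 2 bits (the "2 -> EXT4_EPOCH_BITS" of the message) and the nanoseconds in the remaining 30 bits; the macro and function names are hypothetical and are not the ext4.h definitions.

```c
#include <assert.h>
#include <stdint.h>

#define EPOCH_BITS	2
#define EPOCH_MASK	((1u << EPOCH_BITS) - 1)
#define NSEC_MASK	(~0u << EPOCH_BITS)

/* Without the mask, the upper half of a negative tv_sec floods every bit. */
static uint32_t encode_extra_time_unmasked(int64_t sec, uint32_t nsec)
{
	return (uint32_t)((uint64_t)sec >> 32) | (nsec << EPOCH_BITS);
}

/* With the mask, the epoch stays inside its 2 bits and nsec survives. */
static uint32_t encode_extra_time(int64_t sec, uint32_t nsec)
{
	uint32_t epoch = (uint32_t)((uint64_t)sec >> 32) & EPOCH_MASK;

	assert(nsec < 1000000000u);	/* must fit in the 30 nsec bits */
	return epoch | ((nsec << EPOCH_BITS) & NSEC_MASK);
}

static uint32_t extra_nsec(uint32_t extra)
{
	return (extra & NSEC_MASK) >> EPOCH_BITS;
}

static uint32_t extra_epoch(uint32_t extra)
{
	return extra & EPOCH_MASK;
}
```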
\| * \| \|	ext4: Add a stub for mpage_da_data in the trace header	Josh Stone	2009-09-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The tracepoint ext4_da_write_pages has a struct mpage_da_data* parameter, but that struct is only defined in fs/ext4/ext4.h. This patch adds a forward declaration for that struct, so this tracepoint header can still be used by tools like SystemTap. This is a continuation of the fix in commit 3661d286. http://sourceware.org/bugzilla/show_bug.cgi?id=10703 Signed-off-by: Josh Stone <jistone@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	jbd2: Use tracepoints for history file	Theodore Ts'o	2009-09-30	5	-228/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The /proc/fs/jbd2/<dev>/history was maintained manually; by using tracepoints, we can get all of the existing functionality of the /proc file plus extra capabilities thanks to the ftrace infrastructure. We save memory as a bonus. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4: Use tracepoints for mb_history trace file	Theodore Ts'o	2009-09-30	6	-348/+182
	The /proc/fs/ext4/<dev>/mb_history was maintained manually, and had a number of problems: it required a largish amount of memory to be allocated for each ext4 filesystem, and the s_mb_history_lock introduced a CPU contention problem. By ripping out the mb_history code and replacing it with ftrace tracepoints, we get more functionality: timestamps, event filtering, the ability to correlate mballoc history with other ext4 tracepoints, etc. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4, jbd2: Drop unneeded printks at mount and unmount time	Theodore Ts'o	2009-09-29	5	-22/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are a number of kernel printk's which are printed when an ext4 filesystem is mounted and unmounted. Disable them to economize space in the system logs. In addition, disabling the mballoc stats by default saves a number of unneeded atomic operations for every block allocation or deallocation. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4: Handle nested ext4_journal_start/stop calls without a journal	Curt Wohlgemuth	2009-09-29	3	-13/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes a problem with handling nested calls to ext4_journal_start/ext4_journal_stop, when there is no journal present. Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4: Make sure ext4_dirty_inode() updates the inode in no journal mode	Curt Wohlgemuth	2009-09-29	1	-15/+4
	This patch fixes a problem where ext4_dirty_inode() was not calling ext4_mark_inode_dirty() if the current_handle is not valid, which is the case in no journal mode. It also removes a test for non-matching transaction which can never happen. Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4: Avoid updating the inode table bh twice in no journal mode	Frank Mayhar	2009-09-29	1	-21/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a cleanup of commit 91ac6f4. Since ext4_mark_inode_dirty() has already called ext4_mark_iloc_dirty(), which in turn calls ext4_do_update_inode(), it's not necessary to have ext4_write_inode() call ext4_do_update_inode() in no journal mode. Indeed, it would be duplicated work. Reviewed-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Frank Mayhar <fmayhar@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4: EXT4_IOC_MOVE_EXT: Check for different original and donor inodes first	Theodore Ts'o	2009-09-28	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the check to make sure the original and donor inodes are different earlier, to avoid a potential deadlock by trying to lock the same inode twice. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4: async direct IO for holes and fallocate support	Mingming Cao	2009-09-28	5	-41/+234
	For async direct IO that covers holes or fallocated ranges, the end_io callback now queues the conversion work on a workqueue but doesn't flush the work right away, as that might take too long. But when fsync is called after all the data has completed, the user expects the metadata to be updated as well before fsync returns. Thus we need to flush the conversion work when fsync() is called. This patch keeps track of a list of completed async direct IO requests that have work queued on the workqueue. When fsync() is called, it will go through the list and do the conversion. Signed-off-by: Mingming Cao <cmm@us.ibm.com>
\| * \| \|	ext4: Use end_io callback to avoid direct I/O fallback to buffered I/O	Mingming Cao	2009-09-28	3	-1/+210
	Currently the DIO VFS code passes create = 0 when writing to the middle of a file. It does this to avoid block allocation for holes, so as not to expose stale data when there is a parallel buffered read (which does not hold the i_mutex lock). Direct I/O writes into holes fall back to buffered IO for this reason. Since preallocated extents are treated as holes when doing a get_block() look up (buffer is not mapped), direct IO over fallocate also falls back to buffered IO. Thus ext4 actually silently falls back to buffered IO in the above two cases, which is undesirable. To fix this, this patch creates uninitialized extents when a direct I/O write lands in holes in sparse files, and registers an end_io callback which converts the uninitialized extent to an initialized extent after the I/O is completed. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4: Split uninitialized extents for direct I/O	Mingming Cao	2009-09-28	6	-42/+419
	When writing into an uninitialized extent via direct I/O, and the direct I/O doesn't exactly cover the uninitialized extent, split the extent into uninitialized and initialized extents before submitting the I/O. This avoids needing to deal with an ENOSPC error in the end_io callback that gets used for direct I/O. When the IO is complete, the written extent will be marked as initialized. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
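The splitting step can be pictured with a small standalone sketch. It is not the ext4 implementation: `struct extent`, `split_extent()` and the block numbers are hypothetical, and the model only shows how a write that partly covers an extent yields up to three pieces, with the written piece being the one to mark initialized once the I/O completes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical extent: a run of blocks plus an "initialized" flag. */
struct extent {
	uint32_t start;
	uint32_t len;
	int initialized;
};

/*
 * Split uninitialized extent `ex` around the write [wstart, wstart + wlen).
 * Fills out[] with up to three pieces and stores the index of the piece
 * covered by the write in *written. All pieces stay uninitialized here;
 * the caller marks out[*written] initialized after the I/O completes.
 * Returns the number of pieces.
 */
static int split_extent(struct extent ex, uint32_t wstart, uint32_t wlen,
			struct extent out[3], int *written)
{
	uint32_t ex_end = ex.start + ex.len;
	uint32_t wend = wstart + wlen;
	int n = 0;

	assert(wlen > 0 && wstart >= ex.start && wend <= ex_end);

	if (wstart > ex.start)		/* leading part not covered */
		out[n++] = (struct extent){ ex.start, wstart - ex.start, 0 };

	*written = n;			/* the part the write covers */
	out[n++] = (struct extent){ wstart, wlen, 0 };

	if (wend < ex_end)		/* trailing part not covered */
		out[n++] = (struct extent){ wend, ex_end - wend, 0 };

	return n;
}
```

Doing the split before the I/O is submitted means any allocation it needs happens up front, so the completion path only has to flip a flag.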
\| * \| \|	ext4: release reserved quota when block reservation for delalloc retry	Mingming Cao	2009-09-28	1	-1/+1
	ext4_da_reserve_space() can reserve quota blocks multiple times if ext4_claim_free_blocks() fails and we retry the allocation. We should release the quota reservation before restarting. Bug found by Jan Kara. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4: Adjust ext4_da_writepages() to write out larger contiguous chunks	Theodore Ts'o	2009-09-29	4	-16/+107
	Work around problems in the writeback code to force out writebacks in larger chunks than just 4mb, which is just too small. This also works around limitations in the ext4 block allocator, which can't allocate more than 2048 blocks at a time. So we need to defeat the round-robin characteristics of the writeback code and try to write out as many blocks in one inode before allowing the writeback code to move on to another inode. We add a new per-filesystem tunable, max_writeback_mb_bump, which caps this to a default of 128mb per inode. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
\| * \| \|	ext4: Fix hueristic which avoids group preallocation for closed files	Theodore Ts'o	2009-09-28	1	-1/+1
	The heuristic was designed to avoid using locality group preallocation when writing the last segment of a closed file. Fix it by deferring the assignment of size to the maximum of size and isize until after we check whether size == isize. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
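The ordering issue in that message can be illustrated as follows. This is a simplified sketch, not the mballoc code: the function names are hypothetical, `size` is taken to be the end of the requested allocation and `isize` the file size (both in blocks), and `file_closed` stands in for whatever conditions the real check applies.

```c
#include <stdint.h>

static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/*
 * Ordering before the fix: size is clamped to isize first, so the
 * comparison is true for any allocation that ends at or before
 * end-of-file, not only for the last segment.
 */
static int skip_group_prealloc_before(uint64_t size, uint64_t isize,
				      int file_closed)
{
	size = max_u64(size, isize);
	return size == isize && file_closed;
}

/*
 * Ordering after the fix: compare first, clamp afterwards, so only an
 * allocation that ends exactly at end-of-file matches.
 */
static int skip_group_prealloc_after(uint64_t size, uint64_t isize,
				     int file_closed)
{
	if (size == isize && file_closed)
		return 1;
	size = max_u64(size, isize);	/* clamped value is for later logic */
	(void)size;
	return 0;
}
```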