summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'perf-probes-for-linus-2' of ↵Linus Torvalds2010-03-0518-835/+2063
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-probes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Issue at least one memory barrier in stop_machine_text_poke() perf probe: Correct probe syntax on command line help perf probe: Add lazy line matching support perf probe: Show more lines after last line perf probe: Check function address range strictly in line finder perf probe: Use libdw callback routines perf probe: Use elfutils-libdw for analyzing debuginfo perf probe: Rename probe finder functions perf probe: Fix bugs in line range finder perf probe: Update perf probe document perf probe: Do not show --line option without dwarf support kprobes: Add documents of jump optimization kprobes/x86: Support kprobes jump optimization on x86 x86: Add text_poke_smp for SMP cross modifying code kprobes/x86: Cleanup save/restore registers kprobes/x86: Boost probes when reentering kprobes: Jump optimization sysctl interface kprobes: Introduce kprobes jump optimization kprobes: Introduce generic insn_slot framework kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE
| * x86: Issue at least one memory barrier in stop_machine_text_poke()Masami Hiramatsu2010-03-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix stop_machine_text_poke() to issue smp_mb() before exiting waiting loop, and use cpu_relax() for waiting. Changes in v2: - Don't use ACCESS_ONCE(). Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <20100304033850.3819.74590.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Correct probe syntax on command line helpMasami Hiramatsu2010-03-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move @SRC right after FUNC in syntax according to syntax change on command line help. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <20100304033843.3819.10087.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Add lazy line matching supportMasami Hiramatsu2010-02-257-99/+298
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add lazy line matching support for specifying new probes. This also changes the syntax of perf probe a bit. Now perf probe accepts one of below probe event definitions. 1) Define event based on function name [EVENT=]FUNC[@SRC][:RLN|+OFF|%return|;PTN] [ARG ...] 2) Define event based on source file with line number [EVENT=]SRC:ALN [ARG ...] 3) Define event based on source file with lazy pattern [EVENT=]SRC;PTN [ARG ...] - New lazy matching pattern(PTN) follows ';' (semicolon). And it must be put the end of the definition. - So, @SRC is no longer the part which must be put at the end of the definition. Note that ';' (semicolon) can be interpreted as the end of a command by the shell. This means that you need to quote it. (anyway you will need to quote the lazy pattern itself too, because it may contains other sensitive characters, like '[',']' etc.). Lazy matching ------------- The lazy line matching is similar to glob matching except ignoring spaces in both of pattern and target. e.g. 'a=*' can matches 'a=b', 'a = b', 'a == b' and so on. This provides some sort of flexibility and robustness to probe point definitions against minor code changes. (for example, actual 10th line of schedule() can be changed easily by modifying schedule(), but the same line matching 'rq=cpu_rq*' may still exist.) Changes in v3: - Cast Dwarf_Addr to uintmax_t for printf-formats. Changes in v2: - Cast Dwarf_Addr to unsigned long long for printf-formats. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133611.6725.45078.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Show more lines after last lineMasami Hiramatsu2010-02-251-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Show 2 more lines after the last probe-able line. This will clearly show the last closed-brace of inline functions. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133604.6725.76820.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Check function address range strictly in line finderMasami Hiramatsu2010-02-252-23/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check (inlined) function address range strictly for improving output of probe-able lines of inline functions. Without this change, perf probe --line <function> sometimes showed other inline function bodies too, because it didn't filter out inlined functions. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133557.6725.20697.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Use libdw callback routinesMasami Hiramatsu2010-02-252-177/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use libdw callback functions aggressively, and remove local tree-search API. This change simplifies the code. Changes in v3: - Cast Dwarf_Addr to uintmax_t for printf-formats. Changes in v2: - Cast Dwarf_Addr to unsigned long long for printf-formats. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133549.6725.81499.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Use elfutils-libdw for analyzing debuginfoMasami Hiramatsu2010-02-254-466/+314
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Newer gcc introduces newer & richer debuginfo, and only libdw in elfutils project can support it. So perf probe moves onto elfutils-libdw from libdwarf. Changes in v3: - Cast Dwarf_Addr/Dwarf_Word to uintmax_t for printf-formats. - Recover a sign-prefix which was removed in v2 by mistake. Changes in v2: - Fix a type-casting bug in Makefile. - Cast Dwarf_Addr/Dwarf_Word to unsigned long long for printf-formats. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133542.6725.34724.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Rename probe finder functionsMasami Hiramatsu2010-02-253-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename *_probepoint to *_probe_point, for nothing but a cosmetic reason. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133534.6725.52615.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Fix bugs in line range finderMasami Hiramatsu2010-02-251-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix find_line_range_by_line() to init line_list and remove misconseptional found marking which should be done when real lines are found (if there is no lines probe-able, find_line_range() should return 0). Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133527.6725.52418.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Update perf probe documentMasami Hiramatsu2010-02-251-2/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update perf-probe.txt to suit to current perf-probe command and add some examples. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133519.6725.58816.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * perf probe: Do not show --line option without dwarf supportMasami Hiramatsu2010-02-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not show --line option in help message when perf doesn't support dwarf. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133512.6725.88423.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * kprobes: Add documents of jump optimizationMasami Hiramatsu2010-02-251-12/+195
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add documentations about kprobe jump optimization to Documentation/kprobes.txt. Changes in v10: - Editorial fixups by Jim Keniston. Changes in v8: - Update documentation and benchmark results. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133504.6725.79395.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * kprobes/x86: Support kprobes jump optimization on x86Masami Hiramatsu2010-02-253-22/+441
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce x86 arch-specific optimization code, which supports both of x86-32 and x86-64. This code also supports safety checking, which decodes whole of a function in which probe is inserted, and checks following conditions before optimization: - The optimized instructions which will be replaced by a jump instruction don't straddle the function boundary. - There is no indirect jump instruction, because it will jumps into the address range which is replaced by jump operand. - There is no jump/loop instruction which jumps into the address range which is replaced by jump operand. - Don't optimize kprobes if it is in functions into which fixup code will jumps. This uses text_poke_multibyte() which doesn't support modifying code on NMI/MCE handler. However, since kprobes itself doesn't support NMI/MCE code probing, it's not a problem. Changes in v9: - Use *_text_reserved() for checking the probe can be optimized. - Verify jump address range is in 2G range when preparing slot. - Backup original code when switching optimized buffer, instead of preparing buffer, because there can be int3 of other probes in preparing phase. - Check kprobe is disabled in arch_check_optimized_kprobe(). - Strictly check indirect jump opcodes (ff /4, ff /5). Changes in v6: - Split stop_machine-based jump patching code. - Update comments and coding style. Changes in v5: - Introduce stop_machine-based jump replacing. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133446.6725.78994.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Add text_poke_smp for SMP cross modifying codeMasami Hiramatsu2010-02-252-1/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add generic text_poke_smp for SMP which uses stop_machine() to synchronize modifying code. This stop_machine() method is officially described at "7.1.3 Handling Self- and Cross-Modifying Code" on the intel's software developer's manual 3A. Since stop_machine() can't protect code against NMI/MCE, this function can not modify those handlers. And also, this function is basically for modifying multibyte-single-instruction. For modifying multibyte-multi-instructions, we need another special trap & detour code. This code originaly comes from immediate values with stop_machine() version. Thanks Jason and Mathieu! Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133438.6725.80273.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * kprobes/x86: Cleanup save/restore registersMasami Hiramatsu2010-02-251-61/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce SAVE/RESOTRE_REGS_STRING for cleanup kretprobe-trampoline asm code. These macros will be used for emulating interruption. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133430.6725.83342.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * kprobes/x86: Boost probes when reenteringMasami Hiramatsu2010-02-251-22/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Integrate prepare_singlestep() into setup_singlestep() to boost up reenter probes, if possible. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133423.6725.12071.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * kprobes: Jump optimization sysctl interfaceMasami Hiramatsu2010-02-253-3/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add /proc/sys/debug/kprobes-optimization sysctl which enables and disables kprobes jump optimization on the fly for debugging. Changes in v7: - Remove ctl_name = CTL_UNNUMBERED for upstream compatibility. Changes in v6: - Update comments and coding style. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133415.6725.8274.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * kprobes: Introduce kprobes jump optimizationMasami Hiramatsu2010-02-253-51/+459
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce kprobes jump optimization arch-independent parts. Kprobes uses breakpoint instruction for interrupting execution flow, on some architectures, it can be replaced by a jump instruction and interruption emulation code. This gains kprobs' performance drastically. To enable this feature, set CONFIG_OPTPROBES=y (default y if the arch supports OPTPROBE). Changes in v9: - Fix a bug to optimize probe when enabling. - Check nearby probes can be optimize/unoptimize when disarming/arming kprobes, instead of registering/unregistering. This will help kprobe-tracer because most of probes on it are usually disabled. Changes in v6: - Cleanup coding style for readability. - Add comments around get/put_online_cpus(). Changes in v5: - Use get_online_cpus()/put_online_cpus() for avoiding text_mutex deadlock. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133407.6725.81992.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * kprobes: Introduce generic insn_slot frameworkMasami Hiramatsu2010-02-251-39/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make insn_slot framework support various size slots. Current insn_slot just supports one-size instruction buffer slot. However, kprobes jump optimization needs larger size buffers. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133358.6725.82430.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
| * kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODEMasami Hiramatsu2010-02-252-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change RELATIVEJUMP_INSTRUCTION macro to RELATIVEJUMP_OPCODE since it represents just the opcode byte. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133349.6725.99302.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds2010-03-054-14/+23
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: padata: Allocate the cpumask for the padata instance crypto: authenc - Move saved IV in front of the ablkcipher request crypto: hash - Fix handling of unaligned buffers crypto: authenc - Use correct ahash complete functions crypto: md5 - Set statesize
| * | padata: Allocate the cpumask for the padata instanceSteffen Klassert2010-03-041-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | The cpumask of the padata instance was used without allocated. This caused boot crashes if CONFIG_CPUMASK_OFFSTACK is enabled. This patch fixes this by doing proper allocation for this cpumask. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| * | crypto: authenc - Move saved IV in front of the ablkcipher requestSteffen Klassert2010-03-031-11/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In crypto_authenc_encrypt() we save the IV behind the ablkcipher request. To save space on the request, we overwrite the ablkcipher request with a ahash request after encryption. So the IV may be overwritten by the ahash request. This patch fixes this by placing the IV in front of the ablkcipher/ahash request. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| * | crypto: hash - Fix handling of unaligned buffersSzilveszter Ördög2010-03-031-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The correct way to calculate the start of the aligned part of an unaligned buffer is: offset = ALIGN(offset, alignmask + 1); However, crypto_hash_walk_done() has: offset += alignmask - 1; offset = ALIGN(offset, alignmask + 1); which actually skips a whole block unless offset % (alignmask + 1) == 1. This patch fixes the problem. Signed-off-by: Szilveszter Ördög <slipszi@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| * | crypto: authenc - Use correct ahash complete functionsSteffen Klassert2010-03-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | We accidentally assigned the ahash update complete function to the wrong function pointer in crypto_authenc_verify. This patch fixes this. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| * | crypto: md5 - Set statesizeHerbert Xu2010-03-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | As md5 now has export/import functions, it must set the attribute statesize. Otherwise anything that relies on import/export may fail as they will see a zero statesize. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | | Merge branch 'for_linus' of ↵Linus Torvalds2010-03-0527-657/+1344
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (36 commits) ext4: fix up rb_root initializations to use RB_ROOT ext4: Code cleanup for EXT4_IOC_MOVE_EXT ioctl ext4: Fix the NULL reference in double_down_write_data_sem() ext4: Fix insertion point of extent in mext_insert_across_blocks() ext4: consolidate in_range() definitions ext4: cleanup to use ext4_grp_offs_to_block() ext4: cleanup to use ext4_group_first_block_no() ext4: Release page references acquired in ext4_da_block_invalidatepages ext4: Fix ext4_quota_write cross block boundary behaviour ext4: Convert BUG_ON checks to use ext4_error() instead ext4: Use direct_IO_no_locking in ext4 dio read ext4: use ext4_get_block_write in buffer write ext4: mechanical rename some of the direct I/O get_block's identifiers ext4: make "offset" consistent in ext4_check_dir_entry() ext4: Handle non empty on-disk orphan link ext4: explicitly remove inode from orphan list after failed direct io ext4: fix error handling in migrate ext4: deprecate obsoleted mount options ext4: Fix fencepost error in chosing choosing group vs file preallocation. jbd2: clean up an assertion in jbd2_journal_commit_transaction() ...
| * | | ext4: fix up rb_root initializations to use RB_ROOTVenkatesh Pallipadi2010-03-043-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ext4 uses rb_node = NULL; to zero rb_root at few places. Using RB_ROOT as the initializer is more portable in case the underlying implementation of rbtrees changes in the future. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Eric Paris <eparis@redhat.com>
| * | | ext4: Code cleanup for EXT4_IOC_MOVE_EXT ioctlAkira Fujita2010-03-042-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a) Fix sparse warning in ext4_ioctl() b) Remove unneeded variable in mext_leaf_block() c) Fix spelling typo in mext_check_arguments() Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: Fix the NULL reference in double_down_write_data_sem()Akira Fujita2010-03-041-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If EXT4_IOC_MOVE_EXT ioctl is called with NULL donor_fd, fget() in ext4_ioctl() gets inappropriate file structure for donor; so we need to do this check earlier, before calling double_down_write_data_sem(). Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: Fix insertion point of extent in mext_insert_across_blocks()Akira Fujita2010-03-041-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the leaf node has 2 extent space or fewer and EXT4_IOC_MOVE_EXT ioctl is called with the file offset where after the 2nd extent covers, mext_insert_across_blocks() always tries to insert extent into the first extent. As a result, the file gets corrupted because of wrong extent order. The patch fixes this problem. Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: consolidate in_range() definitionsAkinobu Mita2010-03-034-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are duplicate macro definitions of in_range() in mballoc.h and balloc.c. This consolidates these two definitions into ext4.h, and changes extents.c to use in_range() as well. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger@sun.com>
| * | | ext4: cleanup to use ext4_grp_offs_to_block()Akinobu Mita2010-03-031-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | More cleanup to convert open-coded calculations of the first block number of a free extent to use ext4_grp_offs_to_block() instead. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger@sun.com>
| * | | ext4: cleanup to use ext4_group_first_block_no()Akinobu Mita2010-03-034-28/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a cleanup and simplification patch which takes some open-coded calculations to calculate the first block number of a group and converts them to use the (already defined) ext4_group_first_block_no() function. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger@sun.com>
| * | | ext4: Release page references acquired in ext4_da_block_invalidatepagesJan Kara2010-03-031-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We forget to release page references we acquire in ext4_da_block_invalidatepages. Luckily, this function gets called only if we are not able to allocate blocks for delay-allocated data so that function should better never be called. Also cleanup handling of index variable. Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: Fix ext4_quota_write cross block boundary behaviourDmitry Monakhov2010-03-021-35/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We always assume what dquot update result in changes in one data block But ext4_quota_write() function may handle cross block boundary writes In fact if this ever happen it will result in incorrect journal credits reservation, and later a BUG_ON. As soon this never happen the boundary cross loop is NOOP. In order to make things straight let's remove this loop and assert cross boundary condition. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: Convert BUG_ON checks to use ext4_error() insteadFrank Mayhar2010-03-024-44/+209
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert a bunch of BUG_ONs to emit a ext4_error() message and return EIO. This is a first pass and most notably does _not_ cover mballoc.c, which is a morass of void functions. Signed-off-by: Frank Mayhar <fmayhar@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: Use direct_IO_no_locking in ext4 dio readJiaying Zhang2010-03-021-1/+8
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: use ext4_get_block_write in buffer writeJiaying Zhang2010-03-045-55/+256
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocate uninitialized extent before ext4 buffer write and convert the extent to initialized after io completes. The purpose is to make sure an extent can only be marked initialized after it has been written with new data so we can safely drop the i_mutex lock in ext4 DIO read without exposing stale data. This helps to improve multi-thread DIO read performance on high-speed disks. Skip the nobh and data=journal mount cases to make things simple for now. Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: mechanical rename some of the direct I/O get_block's identifiersJiaying Zhang2010-03-025-69/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit renames some of the direct I/O's block allocation flags, variables, and functions introduced in Mingming's "Direct IO for holes and fallocate" patches so that they can be used by ext4's buffered write path as well. Also changed the related function comments accordingly to cover both direct write and buffered write cases. Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: make "offset" consistent in ext4_check_dir_entry()Toshiyuki Okajima2010-03-021-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The callers of ext4_check_dir_entry() usually pass in the "file offset" (ext4_readdir, htree_dirblock_to_tree, search_dirblock, ext4_dx_find_entry, empty_dir), but a few callers (add_dirent_to_buf, ext4_delete_entry) only pass in the buffer offset. To accomodate those last two (which would be hard to fix otherwise), this patch changes ext4_check_dir_entry() to print the physical block number and the relative offset as well as the passed-in offset. Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: Handle non empty on-disk orphan linkDmitry Monakhov2010-03-011-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of truncate errors we explicitly remove inode from in-core orphan list via orphan_del(NULL, inode) without modifying the on-disk list. But later on, the same inode may be inserted in the orphan list again which will result the on-disk linked list getting corrupted. If inode i_dtime contains valid value, then skip on-disk list modification. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: explicitly remove inode from orphan list after failed direct ioDmitry Monakhov2010-03-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise non-empty orphan list will be triggered on umount. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: fix error handling in migrateDmitry Monakhov2010-03-011-15/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set i_nlink to zero for temporary inode from very beginning. otherwise we may fail to start new journal handle and this inode will be unreferenced but with i_nlink == 1 Since we hold inode reference it can not be pruned. Also add missed journal_start retval check. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: deprecate obsoleted mount optionsDmitry Monakhov2010-03-011-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Declare following list of mount options as deprecated: - bsddf, miniddf - grpid, bsdgroups, nogrpid, sysvgroups Declare following list of default mount options as deprecated: - bsdgroups Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: Fix fencepost error in chosing choosing group vs file preallocation.Tao Ma2010-03-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ext4 multiblock allocator decides whether to use group or file preallocation based on the file size. When the file size reaches s_mb_stream_request (default is 16 blocks), it changes to use a file-specific preallocation. This is cool, but it has a tiny problem. See a simple script: mkfs.ext4 -b 1024 /dev/sda8 1000000 mount -t ext4 -o nodelalloc /dev/sda8 /mnt/ext4 for((i=0;i<5;i++)) do cat /mnt/4096>>/mnt/ext4/a #4096 is a file with 4096 characters. cat /mnt/4096>>/mnt/ext4/b done debuge4fs -R 'stat a' /dev/sda8|grep BLOCKS -A 1 And you get BLOCKS: (0-14):8705-8719, (15):2356, (16-19):8465-8468 So there are 3 extents, a bit strange for the lonely 15th logical block. As we write to the 16 blocks, we choose file preallocation in ext4_mb_group_or_file, but in ext4_mb_normalize_request, we meet with the 16*1024 range, so no preallocation will be carried. file b then reserves the space after '2356', so when when write 16, we start from another part. This patch just change the check in ext4_mb_group_or_file, so that for the lonely 15 we will still use group preallocation. After the patch, we will get: debuge4fs -R 'stat a' /dev/sda8|grep BLOCKS -A 1 BLOCKS: (0-15):8705-8720, (16-19):8465-8468 Looks more sane. Thanks. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | jbd2: clean up an assertion in jbd2_journal_commit_transaction()dingdinghua2010-02-241-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit_transaction has the same value as journal->j_running_transaction, so we can simplify the assert statement. Signed-off-by: dingdinghua <dingdinghua@nrchpc.ac.cn> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: trivial quota cleanupDmitry Monakhov2010-03-011-54/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch is aimed to reorganize and simplify quota code a bit. Quota code is itself complex enough, but we can make it more readable in some places: - Move quota option parsing to separate functions. - Simplify old-quota and journaled-quota mix check. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: mount flags manipulation cleanupDmitry Monakhov2010-02-241-18/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace intermediate EXT4_MOUNT_XXX flags manipulation to corresponding macro. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
OpenPOWER on IntegriCloud