summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* timekeeping: Update tk->cycle_last in resumeThomas Gleixner2013-04-221-1/+1
| | | | | | | | | | | | | | commit 7ec98e15aa (timekeeping: Delay update of clock->cycle_last) forgot to update tk->cycle_last in the resume path. This results in a stale value versus clock->cycle_last and prevents resume in the worst case. Reported-by: Jiri Slaby <jslaby@suse.cz> Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Linux-pm mailing list <linux-pm@lists.linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304211648150.21884@ionos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* posix-timers: Remove unused variableThomas Gleixner2013-04-181-1/+0
| | | | | | | | Remove the unused variable *node introduced by commit 5ed67f05 (posix timers: Allocate timer id per process) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Pavel Emelyanov <xemul@parallels.com>
* clockevents: Switch into oneshot mode even if broadcast registered lateStephen Boyd2013-04-171-0/+10
| | | | | | | | | | | | | | | | | | | | | | | tick_oneshot_notify() is used to notify a particular CPU to try to switch into oneshot mode after a oneshot capable tick device is registered and tick_clock_notify() is used to notify all CPUs to try to switch into oneshot mode after a high res clocksource is registered. There is one caveat; if the tick devices suffer from FEAT_C3_STOP we don't try to switch into oneshot mode unless we have a oneshot capable broadcast device already registered. If the broadcast device is registered after the tick devices that have FEAT_C3_STOP we'll never try to switch into oneshot mode again, causing us to be stuck in periodic mode forever. Avoid this scenario by calling tick_clock_notify() after we register the broadcast device so that we try to switch into oneshot mode on all CPUs one more time. [ tglx: Adopted to timers/core and added a comment ] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Link: http://lkml.kernel.org/r/1366219566-29783-1-git-send-email-sboyd@codeaurora.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* timer_list: Convert timer list to be a proper seq_fileNathan Zimmer2013-04-171-12/+77
| | | | | | | | | | | | | | | | | | | | | | When running with 4096 cores attemping to read /proc/timer_list will fail with an ENOMEM condition. On a sufficantly large systems the total amount of data is more then 4mb, so it won't fit into a single buffer. The failure can also occur on smaller systems when memory fragmentation is high as reported by Dave Jones. Convert /proc/timer_list to a proper seq_file with its own iterator. This is a little more complex given that we have to make two passes with two separate headers. sysrq_timer_list_show also needed to be updated to reflect the fact that now timer_list_show only does one cpu at at time. Signed-off-by: Nathan Zimmer <nzimmer@sgi.com> Reported-by: Dave Jones <davej@redhat.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Link: http://lkml.kernel.org/r/1364345790-14577-3-git-send-email-nzimmer@sgi.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* timer_list: Split timer_list_show_tickdevicesNathan Zimmer2013-04-171-12/+9
| | | | | | | | | | | | | Split timer_list_show_tickdevices() into the header printout and pull the rest up to timer_list_show. This is a preparatory patch for converting timer_list to a proper seqfile with its own iterator Signed-off-by: Nathan Zimmer <nzimmer@sgi.com> Reported-by: Dave Jones <davej@redhat.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Link: http://lkml.kernel.org/r/1364345790-14577-2-git-send-email-nzimmer@sgi.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* posix-timers: Show sigevent info in proc filePavel Emelyanov2013-04-171-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | Previous patch added proc file to list posix timers created by task. Expand the information provided in this file by adding info about notification method, with which timers were created. I.e. after the "ID:" line there go 1. "signal:" line, that shows signal number and sigval bits; 2. "notify:" line, that shows the timer notification method. Thus the timer entry would looke like this: ID: 123 signal: 14/0000000000b005d0 notify: signal/pid.732 This information is enough to understand how timer_create() was called for each particular timer. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Matthew Helsley <matt.helsley@gmail.com> Link: http://lkml.kernel.org/r/513DA024.80404@parallels.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* posix-timers: Introduce /proc/PID/timers filePavel Emelyanov2013-04-171-0/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently kernel doesn't provide any API for getting info about what posix timers are configured by processes. It's implied, that a process which configured some timers, knows what it did. However, for external tools it's impossible to get this information. In particular, this is critical for checkpoint-restore project to have this info. Introduce a per-pid proc file with information about posix timers. Since these timers are shared between threads, this file is present on tgid level only, no such thing in tid subdirs. The file format is expected to be the "/proc/<pid>/smaps"-like, i.e. each timer will occupy seveal lines to allow for future extending. Each new timer entry starts with the ID: <number> line which is added by this patch. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Matthew Helsley <matt.helsley@gmail.com> Link: http://lkml.kernel.org/r/513DA00D.6070009@parallels.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* posix timers: Allocate timer id per process (v2)Pavel Emelyanov2013-04-173-38/+72
| | | | | | | | | | | | | | | | | | | | | | Currently kernel generates IDs for posix timers in a global manner -- there's a kernel-wide IDR tree from which IDs are created. This makes it impossible to recreate a timer with a desired ID (in particular this is done by the CRIU checkpoint-restore project) -- since these IDs are global it may happen, that at the time we recreate a timer, the ID we want for it is already busy by some other timer. In order to address this, replace the IDR tree with a global hash table for timers and makes timer IDs unique per signal_struct (to which timers are linked anyway). With this, two timers belonging to different processes may have equal IDs and we can recreate either of them with the ID we want. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Matthew Helsley <matt.helsley@gmail.com> Link: http://lkml.kernel.org/r/513D9FF5.9010004@parallels.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* timekeeping: Make sure to notify hrtimers when TAI offset changesJohn Stultz2013-04-111-3/+7
| | | | | | | | | Now that we have CLOCK_TAI timers, make sure we notify hrtimer code when TAI offset is changed. Signed-off-by: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/1365622909-953-1-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* hrtimer: Fix ktime_add_ns() overflow on 32bit architecturesDavid Engraf2013-04-081-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One can trigger an overflow when using ktime_add_ns() on a 32bit architecture not supporting CONFIG_KTIME_SCALAR. When passing a very high value for u64 nsec, e.g. 7881299347898368000 the do_div() function converts this value to seconds (7881299347) which is still to high to pass to the ktime_set() function as long. The result in is a negative value. The problem on my system occurs in the tick-sched.c, tick_nohz_stop_sched_tick() when time_delta is set to timekeeping_max_deferment(). The check for time_delta < KTIME_MAX is valid, thus ktime_add_ns() is called with a too large value resulting in a negative expire value. This leads to an endless loop in the ticker code: time_delta: 7881299347898368000 expires = ktime_add_ns(last_update, time_delta) expires: negative value This fix caps the value to KTIME_MAX. This error doesn't occurs on 64bit or architectures supporting CONFIG_KTIME_SCALAR (e.g. ARM, x86-32). Cc: stable@vger.kernel.org Signed-off-by: David Engraf <david.engraf@sysgo.com> [jstultz: Minor tweaks to commit message & header] Signed-off-by: John Stultz <john.stultz@linaro.org>
* hrtimer: Add expiry time overflow check in hrtimer_interruptPrarit Bhargava2013-04-081-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The settimeofday01 test in the LTP testsuite effectively does gettimeofday(current time); settimeofday(Jan 1, 1970 + 100 seconds); settimeofday(current time); This test causes a stack trace to be displayed on the console during the setting of timeofday to Jan 1, 1970 + 100 seconds: [ 131.066751] ------------[ cut here ]------------ [ 131.096448] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x135/0x140() [ 131.104935] Hardware name: Dinar [ 131.108150] Modules linked in: sg nfsv3 nfs_acl nfsv4 auth_rpcgss nfs dns_resolver fscache lockd sunrpc nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables kvm_amd kvm sp5100_tco bnx2 i2c_piix4 crc32c_intel k10temp fam15h_power ghash_clmulni_intel amd64_edac_mod pcspkr serio_raw edac_mce_amd edac_core microcode xfs libcrc32c sr_mod sd_mod cdrom ata_generic crc_t10dif pata_acpi radeon i2c_algo_bit drm_kms_helper ttm drm ahci pata_atiixp libahci libata usb_storage i2c_core dm_mirror dm_region_hash dm_log dm_mod [ 131.176784] Pid: 0, comm: swapper/28 Not tainted 3.8.0+ #6 [ 131.182248] Call Trace: [ 131.184684] <IRQ> [<ffffffff810612af>] warn_slowpath_common+0x7f/0xc0 [ 131.191312] [<ffffffff8106130a>] warn_slowpath_null+0x1a/0x20 [ 131.197131] [<ffffffff810b9fd5>] clockevents_program_event+0x135/0x140 [ 131.203721] [<ffffffff810bb584>] tick_program_event+0x24/0x30 [ 131.209534] [<ffffffff81089ab1>] hrtimer_interrupt+0x131/0x230 [ 131.215437] [<ffffffff814b9600>] ? cpufreq_p4_target+0x130/0x130 [ 131.221509] [<ffffffff81619119>] smp_apic_timer_interrupt+0x69/0x99 [ 131.227839] [<ffffffff8161805d>] apic_timer_interrupt+0x6d/0x80 [ 131.233816] <EOI> [<ffffffff81099745>] ? sched_clock_cpu+0xc5/0x120 [ 131.240267] [<ffffffff814b9ff0>] ? cpuidle_wrap_enter+0x50/0xa0 [ 131.246252] [<ffffffff814b9fe9>] ? cpuidle_wrap_enter+0x49/0xa0 [ 131.252238] [<ffffffff814ba050>] cpuidle_enter_tk+0x10/0x20 [ 131.257877] [<ffffffff814b9c89>] cpuidle_idle_call+0xa9/0x260 [ 131.263692] [<ffffffff8101c42f>] cpu_idle+0xaf/0x120 [ 131.268727] [<ffffffff815f8971>] start_secondary+0x255/0x257 [ 131.274449] ---[ end trace 1151a50552231615 ]--- When we change the system time to a low value like this, the value of timekeeper->offs_real will be a negative value. It seems that the WARN occurs because an hrtimer has been started in the time between the releasing of the timekeeper lock and the IPI call (via a call to on_each_cpu) in clock_was_set() in the do_settimeofday() code. The end result is that a REALTIME_CLOCK timer has been added with softexpires = expires = KTIME_MAX. The hrtimer_interrupt() fires/is called and the loop at kernel/hrtimer.c:1289 is executed. In this loop the code subtracts the clock base's offset (which was set to timekeeper->offs_real in do_settimeofday()) from the current hrtimer_cpu_base->expiry value (which was KTIME_MAX): KTIME_MAX - (a negative value) = overflow A simple check for an overflow can resolve this problem. Using KTIME_MAX instead of the overflow value will result in the hrtimer function being run, and the reprogramming of the timer after that. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> [jstultz: Tweaked commit subject] Signed-off-by: John Stultz <john.stultz@linaro.org>
* timekeeping: Shorten seq_count regionThomas Gleixner2013-04-041-3/+2
| | | | | | | | | | | Shorten the seqcount write hold region to the actual update of the timekeeper and the related data (e.g vsyscall). On a contemporary x86 system this reduces the maximum latencies on Preempt-RT from 8us to 4us on the non-timekeeping cores. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
* timekeeping: Implement a shadow timekeeperThomas Gleixner2013-04-041-12/+29
| | | | | | | | | | | | | | Use the shadow timekeeper to do the update_wall_time() adjustments and then copy it over to the real timekeeper. Keep the shadow timekeeper in sync when updating stuff outside of update_wall_time(). This allows us to limit the timekeeper_seq hold time to the update of the real timekeeper and the vsyscall data in the next patch. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
* timekeeping: Delay update of clock->cycle_lastThomas Gleixner2013-04-041-1/+3
| | | | | | | | | For calculating the new timekeeper values store the new cycle_last value in the timekeeper and update the clock->cycle_last just when we actually update the new values. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
* timekeeping: Store cycle_last value in timekeeper struct as wellThomas Gleixner2013-04-042-2/+4
| | | | | | | | | | | For implementing a shadow timekeeper and a split calculation/update region we need to store the cycle_last value in the timekeeper and update the value in the clocksource struct only in the update region. Add the extra storage to the timekeeper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
* ntp: Remove ntp_lock, using the timekeeping locks to protect ntp stateJohn Stultz2013-04-041-37/+4
| | | | | | | | | | | | | In order to properly handle the NTP state in future changes to the timekeeping lock management, this patch moves the management of all of the ntp state under the timekeeping locks. This allows us to remove the ntp_lock. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
* timekeeping: Simplify tai updating from do_adjtimexJohn Stultz2013-04-041-5/+4
| | | | | | | | | | | Since we are taking the timekeeping locks, just go ahead and update any tai change directly, rather then dropping the lock and calling a function that will just take it again. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
* timekeeping: Hold timekeepering locks in do_adjtimex and hardppsJohn Stultz2013-04-041-2/+17
| | | | | | | | | | | In moving the NTP state to be protected by the timekeeping locks, be sure to acquire the timekeeping locks prior to calling ntp functions. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
* timekeeping: Move ADJ_SETOFFSET to top level do_adjtimex()John Stultz2013-04-042-11/+11
| | | | | | | | | | | | | | Since ADJ_SETOFFSET adjusts the timekeeping state, process it as part of the top level do_adjtimex() function in timekeeping.c. This avoids deadlocks that could occur once we change the ntp locking rules. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
* ntp: Rework do_adjtimex to take timespec and tai argumentsJohn Stultz2013-04-043-16/+17
| | | | | | | | | | | In order to change the locking rules, we need to provide the timespec and tai values rather then having the ntp logic acquire these values itself. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
* ntp: Move timex validation to timekeeping do_adjtimex call.John Stultz2013-04-043-5/+8
| | | | | | | | | | Move logic that does not need the ntp state to be done in the timekeeping do_adjtimex() call. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
* ntp: Move do_adjtimex() and hardpps() functions to timekeeping.cJohn Stultz2013-04-044-12/+36
| | | | | | | | | | | | | | | In preparation for changing the ntp locking rules, move do_adjtimex and hardpps accessor functions to timekeeping.c, but keep the code logic in ntp.c. This patch also introduces a ntp_internal.h file so timekeeping specific interfaces of ntp.c can be more limitedly shared with timekeeping.c. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
* ntp: Split out timex validation from do_adjtimexJohn Stultz2013-04-041-12/+27
| | | | | | | | | | Split out the timex validation done in do_adjtimex into a separate function. This will help simplify logic in following patches. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
* Merge branch 'fortglx/3.10/time' of ↵Thomas Gleixner2013-04-03130-968/+1676
|\ | | | | | | git://git.linaro.org/people/jstultz/linux into timers/core
| * ARM: bcm281xx: Add timer driver (driver portion)Christian Daudt2013-03-284-5/+215
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for the Broadcom timer, used in the following SoCs: BCM11130, BCM11140, BCM11351, BCM28145, BCM28155 Updates from V6: - Split DT portion into a separate patch Updates from V5: - Rebase to latest arm-soc/for-next Updates from V4: - Switch code to use CLOCKSOURCE_OF_DECLARE Updates from V3: - Migrate to 3.9 timer framework updates Updates from V2: - prepend static fns + fields with kona_ Updates from V1: - Rename bcm_timer.c to bcm_kona_timer.c - Pull .h into bcm_kona_timer.c - Make timers static - Clean up comment block - Switched to using clockevents_config_and_register - Added an error to the get_timer loop if it repeats too much - Added to Documentation/devicetree/bindings/arm/bcm/bcm,kona-timer.txt - Added missing readl to timer_disable_and_clear Note: bcm,kona-timer was kept as the 'compatible' field to make it specific enough for when there are multiple bcm timers (bcm,timer is too generic). Signed-off-by: Christian Daudt <csd@broadcom.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: __timekeeping_set_tai_offset can be staticFengguang Wu2013-03-251-1/+1
| | | | | | | | | | | | | | | | | | Yet again, the kbuild test robot saves the day, noting I left out defining __timekeeping_set_tai_offset as static. It even sent me this patch. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: Split timekeeper_lock into lock and seqcountThomas Gleixner2013-03-221-59/+73
| | | | | | | | | | | | | | | | | | | | | | | | We want to shorten the seqcount write hold time. So split the seqlock into a lock and a seqcount. Open code the seqwrite_lock in the places which matter and drop the sequence counter update where it's pointless. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [jstultz: Merge fixups from CLOCK_TAI collisions] Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: Move lock out of timekeeper structThomas Gleixner2013-03-222-57/+53
| | | | | | | | | | | | | | | | | | Make the lock a separate entity. Preparatory patch for shadow timekeeper structure. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [Merged with CLOCK_TAI changes] Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: Make jiffies_lock internalThomas Gleixner2013-03-223-1/+3
| | | | | | | | | | | | | | Nothing outside of the timekeeping core needs that lock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: Calc stuff onceThomas Gleixner2013-03-221-3/+4
| | | | | | | | | | | | | | | | Calculate the cycle interval shifted value once. No functional change, just makes the code more readable. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
| * hrtimer: Add hrtimer support for CLOCK_TAIJohn Stultz2013-03-225-3/+44
| | | | | | | | | | | | Add hrtimer support for CLOCK_TAI, as well as posix timer interfaces. Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: Add CLOCK_TAI clockidJohn Stultz2013-03-224-4/+43
| | | | | | | | | | | | | | | | | | This add a CLOCK_TAI clockid and the needed accessors. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: Move TAI managment into timekeeping core from ntpJohn Stultz2013-03-224-8/+59
| | | | | | | | | | | | | | | | | | | | | | Currently NTP manages the TAI offset. Since there's plans for a CLOCK_TAI clockid, push the TAI management into the timekeeping core. CC: Thomas Gleixner <tglx@linutronix.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Richard Cochran <richardcochran@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: utilize the suspend-nonstop clocksource to count suspended timeFeng Tang2013-03-151-7/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some new processors whose TSC clocksource won't stop during suspend. Currently, after system resumes, kernel will use persistent clock or RTC to compensate the sleep time, but with these nonstop clocksources, we could skip the special compensation from external sources, and just use current clocksource for time recounting. This can solve some time drift bugs caused by some not-so-accurate or error-prone RTC devices. The current way to count suspended time is first try to use the persistent clock, and then try the RTC if persistent clock can't be used. This patch will change the trying order to: suspend-nonstop clocksource -> persistent clock -> RTC When counting the sleep time with nonstop clocksource, use an accurate way suggested by Jason Gunthorpe to cover very large delta cycles. Signed-off-by: Feng Tang <feng.tang@intel.com> [jstultz: Small optimization, avoiding re-reading the clocksource] Signed-off-by: John Stultz <john.stultz@linaro.org>
| * x86: tsc: Add support for new S3_NONSTOP featureFeng Tang2013-03-151-1/+5
| | | | | | | | | | | | | | Add support for new S3_NONSTOP feature Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
| * clocksource: Add new feature flag CLOCK_SOURCE_SUSPEND_NONSTOPFeng Tang2013-03-151-0/+1
| | | | | | | | | | | | | | | | | | | | Some x86 processors have a TSC clocksource, which continues to run even when system is suspended. Also most OMAP platforms have a 32 KHz timer which has similar capability. Add a feature flag so that it could be utilized. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
| * x86: Add cpu capability flag X86_FEATURE_NONSTOP_TSC_S3Feng Tang2013-03-152-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | On some new Intel Atom processors (Penwell and Cloverview), there is a feature that the TSC won't stop in S3 state, say the TSC value won't be reset to 0 after resume. This feature makes TSC a more reliable clocksource and could benefit the timekeeping code during system suspend/resume cycle, so add a flag for it. Signed-off-by: Feng Tang <feng.tang@intel.com> [jstultz: Fix checkpatch warning] Signed-off-by: John Stultz <john.stultz@linaro.org>
| * x86: Do full rtc synchronization with ntpPrarit Bhargava2013-03-154-83/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every 11 minutes ntp attempts to update the x86 rtc with the current system time. Currently, the x86 code only updates the rtc if the system time is within +/-15 minutes of the current value of the rtc. This was done originally to avoid setting the RTC if the RTC was in localtime mode (common with Windows dualbooting). Other architectures do a full synchronization and now that we have better infrastructure to detect when the RTC is in localtime, there is no reason that x86 should be software limited to a 30 minute window. This patch changes the behavior of the kernel to do a full synchronization (year, month, day, hour, minute, and second) of the rtc when ntp requests a synchronization between the system time and the rtc. I've used the RTC library functions in this patchset as they do all the required bounds checking. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: x86@kernel.org Cc: Matt Fleming <matt.fleming@intel.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: linux-efi@vger.kernel.org Signed-off-by: Prarit Bhargava <prarit@redhat.com> [jstultz: Tweak commit message, fold in build fix found by fengguang Also add select RTC_LIB to X86, per new dependency, as found by prarit] Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: Use inject_offset in warp_clockJohn Stultz2013-03-151-3/+3
| | | | | | | | | | | | | | | | | | | | When warping the clock (from a local time RTC), use timekeeping_inject_offset() to atomically add the offset. This avoids any minor time error caused by the delay between reading the time, and then setting the adjusted time. Signed-off-by: John Stultz <john.stultz@linaro.org>
| * timekeeping: Avoid adjust kernel time once hwclock kept in UTC timeDong Zhu2013-03-151-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | If the Hardware Clock kept in local time,kernel will adjust the time to be UTC time.But if Hardware Clock kept in UTC time,system will make a dummy settimeofday call first (sys_tz.tz_minuteswest = 0) to make sure the time is not shifted,so at this point I think maybe it is not necessary to set the kernel time once the sys_tz.tz_minuteswest is zero. Signed-off-by: Dong Zhu <bluezhudong@gmail.com> [jstultz: Updated to merge with conflicting changes ] Signed-off-by: John Stultz <john.stultz@linaro.org>
| * Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2013-03-114-14/+40
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc minor fixes mostly related to tracing" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: s390: Fix a header dependencies related build error tracing: update documentation of snapshot utility tracing: Do not return EINVAL in snapshot when not allocated tracing: Add help of snapshot feature when snapshot is empty ftrace: Update the kconfig for DYNAMIC_FTRACE
| | * s390: Fix a header dependencies related build errorLi Zefan2013-03-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 877c685607925238e302cd3aa38788dca6c1b226 ("perf: Remove include of cgroup.h from perf_event.h") caused this build failure if PERF_EVENTS is enabled: In file included from arch/s390/include/asm/perf_event.h:9:0, from include/linux/perf_event.h:24, from kernel/events/ring_buffer.c:12: arch/s390/include/asm/cpu_mf.h: In function 'qctri': arch/s390/include/asm/cpu_mf.h:61:12: error: 'EINVAL' undeclared (first use in this function) cpu_mf.h had an implicit errno.h dependency, which was added indirectly via cgroups.h but not anymore. Add it explicitly. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Link: http://lkml.kernel.org/r/51385F79.7000106@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * tracing: update documentation of snapshot utilityHiraku Toyooka2013-03-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now, "snapshot" file returns success on a reset of snapshot buffer even if the buffer wasn't allocated, instead of returning EINVAL. This patch updates snapshot desctiption according to the change. Link: http://lkml.kernel.org/r/51399409.4090207@hitachi.com Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * tracing: Do not return EINVAL in snapshot when not allocatedSteven Rostedt (Red Hat)2013-03-071-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To use the tracing snapshot feature, writing a '1' into the snapshot file causes the snapshot buffer to be allocated if it has not already been allocated and dose a 'swap' with the main buffer, so that the snapshot now contains what was in the main buffer, and the main buffer now writes to what was the snapshot buffer. To free the snapshot buffer, a '0' is written into the snapshot file. To clear the snapshot buffer, any number but a '0' or '1' is written into the snapshot file. But if the file is not allocated it returns -EINVAL error code. This is rather pointless. It is better just to do nothing and return success. Acked-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * tracing: Add help of snapshot feature when snapshot is emptySteven Rostedt (Red Hat)2013-03-071-1/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When cat'ing the snapshot file, instead of showing an empty trace header like the trace file does, show how to use the snapshot feature. Also, this is a good place to show if the snapshot has been allocated or not. Users may want to "pre allocate" the snapshot to have a fast "swap" of the current buffer. Otherwise, a swap would be slow and might fail as it would need to allocate the snapshot buffer, and that might fail under tight memory constraints. Here's what it looked like before: # tracer: nop # # entries-in-buffer/entries-written: 0/0 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | Here's what it looks like now: # tracer: nop # # # * Snapshot is freed * # # Snapshot commands: # echo 0 > snapshot : Clears and frees snapshot buffer # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated. # Takes a snapshot of the main buffer. # echo 2 > snapshot : Clears snapshot buffer (but does not allocate) # (Doesn't have to be '2' works with any number that # is not a '0' or '1') Acked-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * Merge branch 'tip/perf/urgent' of ↵Ingo Molnar2013-02-281-10/+14
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent Pull an ftrace Kconfig help text fix from Steve Rostedt. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | | * ftrace: Update the kconfig for DYNAMIC_FTRACESteven Rostedt2013-02-271-10/+14
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The prompt to enable DYNAMIC_FTRACE (the ability to nop and enable function tracing at run time) had a confusing statement: "enable/disable ftrace tracepoints dynamically" This was written before tracepoints were added to the kernel, but now that tracepoints have been added, this is very confusing and has confused people enough to give wrong information during presentations. Not only that, I looked at the help text, and it still references that dreaded daemon that use to wake up once a second to update the nop locations and brick NICs, that hasn't been around for over five years. Time to bring the text up to the current decade. Cc: stable@vger.kernel.org Reported-by: Ezequiel Garcia <elezegarcia@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2013-03-1184-696/+843
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Missing cancel of work items in mac80211 MLME, from Ben Greear. 2) Fix DMA mapping handling in iwlwifi by using coherent DMA for command headers, from Johannes Berg. 3) Decrease the amount of pressure on the page allocator by using order 1 pages less in iwlwifi, from Emmanuel Grumbach. 4) Fix mesh PS broadcast OOPS in mac80211, from Marco Porsch. 5) Don't forget to recalculate idle state in mac80211 monitor interface, from Felix Fietkau. 6) Fix varargs in netfilter conntrack handler, from Joe Perches. 7) Need to reset entire chip when command queue fills up in iwlwifi, from Emmanuel Grumbach. 8) The TX antenna value must be valid when calibrations are performed in iwlwifi, fix from Dor Shaish. 9) Don't generate netfilter audit log entries when audit is disabled, from Gao Feng. 10) Deal with DMA unit hang on e1000e during power state transitions, from Bruce Allan. 11) Remove BUILD_BUG_ON check from igb driver, from Alexander Duyck. 12) Fix lockdep warning on i2c handling of igb driver, from Carolyn Wyborny. 13) Fix several TTY handling issues in IRDA ircomm tty driver, from Peter Hurley. 14) Several QFQ packet scheduler fixes from Paolo Valente. 15) When VXLAN encapsulates on transmit, we have to reset the netfilter state. From Zang MingJie. 16) Fix jiffie check in net_rx_action() so that we really cap the processing at 2HZ. From Eric Dumazet. 17) Fix erroneous trigger of IP option space exhaustion, when routers are pre-specified and we are looking to see if we can insert a timestamp, we will have the space. From David Ward. 18) Fix various issues in benet driver wrt waiting for firmware to finish POST after resets or errors. From Gavin Shan and Sathya Perla. 19) Fix TX locking in SFC driver, from Ben Hutchings. 20) Like the VXLAN fix above, when we encap in a TUN device we have to reset the netfilter state. This should fix several strange crashes reported by Dave Jones and others. From Eric Dumazet. 21) Don't forget to clean up MAC address resources when shutting down a port in mlx4 driver, from Yan Burman. 22) Fix divide by zero in vmxnet3 driver, from Bhavesh Davda. 23) Fix device statistic regression in tg3 when the driver is using phylib, from Nithin Sujir. 24) Fix info leak in several netlink handlers, from Mathias Krause. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (79 commits) 6lowpan: Fix endianness issue in is_addr_link_local(). rrunner.c: fix possible memory leak in rr_init_one() dcbnl: fix various netlink info leaks rtnl: fix info leak on RTM_GETLINK request for VF devices bridge: fix mdb info leaks tg3: Update link_up flag for phylib devices ipv6: stop multicast forwarding to process interface scoped addresses bridging: fix rx_handlers return code netlabel: fix build problems when CONFIG_IPV6=n drivers/isdn: checkng length to be sure not memory overflow net/rds: zero last byte for strncpy bnx2x: Fix SFP+ misconfiguration in iSCSI boot scenario bnx2x: Fix intermittent long KR2 link up time macvlan: Set IFF_UNICAST_FLT flag to prevent unnecessary promisc mode. team: unsyc the devices addresses when port is removed bridge: add missing vid to br_mdb_get() Fix: sparse warning in inet_csk_prepare_forced_close afkey: fix a typo MAINTAINERS: Update qlcnic maintainers list netlabel: correctly list all the static label mappings ...
| | * | 6lowpan: Fix endianness issue in is_addr_link_local().YOSHIFUJI Hideaki / 吉藤英明2013-03-101-1/+1
| | | | | | | | | | | | | | | | | | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | rrunner.c: fix possible memory leak in rr_init_one()David Oostdyk2013-03-101-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the event that register_netdev() failed, the rrpriv->evt_ring allocation would have not been freed. Signed-off-by: David Oostdyk <daveo@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud