summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'timers-for-linus' of ↵Linus Torvalds2010-08-065-12/+141
|\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: Documentation: Add timers/timers-howto.txt timer: Added usleep_range timer Revert "timer: Added usleep[_range] timer" clockevents: Remove the per cpu tick skew posix_timer: Move copy_to_user(created_timer_id) down in timer_create() timer: Added usleep[_range] timer timers: Document meaning of deferrable timer
| * Documentation: Add timers/timers-howto.txtPatrick Pannuto2010-08-041-0/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This file seeks to explain the nuances in various delays; many driver writers are not necessarily familiar with the various kernel timers, their shortfalls, and quirks. When faced with ndelay, udelay, mdelay, usleep_range, msleep, and msleep_interrubtible the question "How do I just wait 1 ms for my hardware to latch?" has the non-intuitive "best" answer: usleep_range(1000,1500) This patch is followed by a series of checkpatch additions that seek to help kernel hackers pick the best delay. Signed-off-by: Patrick Pannuto <ppannuto@codeaurora.org> Cc: apw@canonical.com Cc: corbet@lwn.net Cc: arjan@linux.intel.com Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1280786467-26999-3-git-send-email-ppannuto@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * timer: Added usleep_range timerPatrick Pannuto2010-08-042-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | usleep_range is a finer precision implementations of msleep and is designed to be a drop-in replacement for udelay where a precise sleep / busy-wait is unnecessary. Since an easy interface to hrtimers could lead to an undesired proliferation of interrupts, we provide only a "range" API, forcing the caller to think about an acceptable tolerance on both ends and hopefully avoiding introducing another interrupt. INTRO As discussed here ( http://lkml.org/lkml/2007/8/3/250 ), msleep(1) is not precise enough for many drivers (yes, sleep precision is an unfair notion, but consistently sleeping for ~an order of magnitude greater than requested is worth fixing). This patch adds a usleep API so that udelay does not have to be used. Obviously not every udelay can be replaced (those in atomic contexts or being used for simple bitbanging come to mind), but there are many, many examples of mydriver_write(...) /* Wait for hardware to latch */ udelay(100) in various drivers where a busy-wait loop is neither beneficial nor necessary, but msleep simply does not provide enough precision and people are using a busy-wait loop instead. CONCERNS FROM THE RFC Why is udelay a problem / necessary? Most callers of udelay are in device/ driver initialization code, which is serial... As I see it, there is only benefit to sleeping over a delay; the notion of "refactoring" areas that use udelay was presented, but I see usleep as the refactoring. Consider i2c, if the bus is busy, you need to wait a bit (say 100us) before trying again, your current options are: * udelay(100) * msleep(1) <-- As noted above, actually as high as ~20ms on some platforms, so not really an option * Manually set up an hrtimer to try again in 100us (which is what usleep does anyway...) People choose the udelay route because it is EASY; we need to provide a better easy route. Device / driver / boot code is *currently* serial, but every few months someone makes noise about parallelizing boot, and IMHO, a little forward-thinking now is one less thing to worry about if/when that ever happens udelay's could be preempted Sure, but if udelay plans on looping 1000 times, and it gets preempted on loop 200, whenever it's scheduled again, it is going to do the next 800 loops. Is the interruptible case needed? Probably not, but I see usleep as a very logical parallel to msleep, so it made sense to include the "full" API. Processors are getting faster (albeit not as quickly as they are becoming more parallel), so if someone wanted to be interruptible for a few usecs, why not let them? If this is a contentious point, I'm happy to remove it. OTHER THOUGHTS I believe there is also value in exposing the usleep_range option; it gives the scheduler a lot more flexibility and allows the programmer to express his intent much more clearly; it's something I would hope future driver writers will take advantage of. To get the results in the NUMBERS section below, I literally s/udelay/usleep the kernel tree; I had to go in and undo the changes to the USB drivers, but everything else booted successfully; I find that extremely telling in and of itself -- many people are using a delay API where a sleep will suit them just fine. SOME ATTEMPTS AT NUMBERS It turns out that calculating quantifiable benefit on this is challenging, so instead I will simply present the current state of things, and I hope this to be sufficient: How many udelay calls are there in 2.6.35-rc5? udealy(ARG) >= | COUNT 1000 | 319 500 | 414 100 | 1146 20 | 1832 I am working on Android, so that is my focus for this. The following table is a modified usleep that simply printk's the amount of time requested to sleep; these tests were run on a kernel with udelay >= 20 --> usleep "boot" is power-on to lock screen "power collapse" is when the power button is pushed and the device suspends "resume" is when the power button is pushed and the lock screen is displayed (no touchscreen events or anything, just turning on the display) "use device" is from the unlock swipe to clicking around a bit; there is no sd card in this phone, so fail loading music, video, camera ACTION | TOTAL NUMBER OF USLEEP CALLS | NET TIME (us) boot | 22 | 1250 power-collapse | 9 | 1200 resume | 5 | 500 use device | 59 | 7700 The most interesting category to me is the "use device" field; 7700us of busy-wait time that could be put towards better responsiveness, or at the least less power usage. Signed-off-by: Patrick Pannuto <ppannuto@codeaurora.org> Cc: apw@canonical.com Cc: corbet@lwn.net Cc: arjan@linux.intel.com Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * Revert "timer: Added usleep[_range] timer"Thomas Gleixner2010-08-042-28/+0
| | | | | | | | | | | | | | This reverts commit 22b8f15c2f7130bb0386f548428df2ffd4e81903 to merge an advanced version. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * clockevents: Remove the per cpu tick skewArjan van de Ven2010-08-021-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historically, Linux has tried to make the regular timer tick on the various CPUs not happen at the same time, to avoid contention on xtime_lock. Nowadays, with the tickless kernel, this contention no longer happens since time keeping and updating are done differently. In addition, this skew is actually hurting power consumption in a measurable way on many-core systems. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <20100727210210.58d3118c@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * posix_timer: Move copy_to_user(created_timer_id) down in timer_create()Andrey Vagin2010-07-231-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to Oleg Nesterov: We can move copy_to_user(created_timer_id) down after "if (timer_event_spec)" block too. (but before CLOCK_DISPATCH(), of course). Signed-off-by: Andrey Vagin <avagin@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Andrey Vagin <avagin@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * timer: Added usleep[_range] timerPatrick Pannuto2010-07-232-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | usleep[_range] are finer precision implementations of msleep and are designed to be drop-in replacements for udelay where a precise sleep / busy-wait is unnecessary. They also allow an easy interface to specify slack when a precise (ish) wakeup is unnecessary to help minimize wakeups Signed-off-by: Patrick Pannuto <ppannuto@codeaurora.org> Cc: akinobu.mita@gmail.com Cc: sboyd@codeaurora.org Acked-by: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <4C44CDD2.1070708@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * timers: Document meaning of deferrable timerJ. Bruce Fields2010-07-231-2/+7
| | | | | | | | | | | | | | | | | | | | | | Steal some text from 6e453a67510 "Add support for deferrable timers". A reader shouldn't have to dig through the git logs for the basic description of a deferrable timer. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: johnstul@us.ibm.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86, kvm: Remove cast obsoleted by set_64bit() prototype cleanupH. Peter Anvin2010-08-061-5/+1
| | | | | | | | | | | | | | | | | | | | | | KVM ended up having to put a pretty ugly wrapper around set_64bit() in order to get the type right. Now set_64bit() takes the expected u64 type, and this wrapper can be cleaned up. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Avi Kivity <avi@redhat.com> LKML-Reference: <4C5C4E7A.8040603@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6Linus Torvalds2010-08-0697-3195/+1197
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6: pcmcia: avoid buffer overflow in pcmcia_setup_isa_irq pcmcia: do not request windows if you don't need to pcmcia: insert PCMCIA device resources into resource tree pcmcia: export resource information to sysfs pcmcia: use struct resource for PCMCIA devices, part 2 pcmcia: remove memreq_t pcmcia: move local definitions out of include/pcmcia/cs.h pcmcia: do not use io_req_t when calling pcmcia_request_io() pcmcia: do not use io_req_t after call to pcmcia_request_io() pcmcia: use struct resource for PCMCIA devices pcmcia: clean up cs.h pcmcia: use pcmica_{read,write}_config_byte pcmcia: remove cs_types.h pcmcia: remove unused flag, simplify headers pcmcia: remove obsolete CS_EVENT_ definitions pcmcia: split up central event handler pcmcia: simplify event callback pcmcia: remove obsolete ioctl Conflicts in: - drivers/staging/comedi/drivers/* - drivers/staging/wlags49_h2/wl_cs.c due to dev_info_t and whitespace changes
| * | pcmcia: avoid buffer overflow in pcmcia_setup_isa_irqDominik Brodowski2010-08-031-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NR_IRQS may be as low as 16, causing a (harmless?) buffer overflow in pcmcia_setup_isa_irq(): static u8 pcmcia_used_irq[NR_IRQS]; ... if ((try < 32) && pcmcia_used_irq[irq]) continue; This is read-only, so if this address would be non-zero, it would just mean we would not attempt an IRQ >= NR_IRQS -- which would fail anyway! And as request_irq() fails for an irq >= NR_IRQS, the setting code path: pcmcia_used_irq[irq]++; is never reached as well. Reported-by: Christoph Fritz <chf.fritz@googlemail.com> CC: stable@kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
| * | pcmcia: do not request windows if you don't need toDominik Brodowski2010-08-035-140/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several drivers contained dummy code to request for memory windows, even though they never made use of it. Remove all such code snippets. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: insert PCMCIA device resources into resource treeDominik Brodowski2010-08-036-34/+60
| | | | | | | | | | | | | | | | | | | | | | | | Insert PCMCIA device resources into the resource tree. However, this is currently only implemented for sockets which do not statically map the resources. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: export resource information to sysfsDominik Brodowski2010-08-031-0/+13
| | | | | | | | | | | | Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: use struct resource for PCMCIA devices, part 2Dominik Brodowski2010-08-0310-78/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use struct resource * also for iomem resources. CC: linux-mtd@lists.infradead.org CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: remove memreq_tDominik Brodowski2010-08-0318-105/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Page already had to be set to 0; Offset can easily be passed as parameter to pcmcia_map_mem_page. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-ide@vger.kernel.org CC: linux-usb@vger.kernel.org CC: laforge@gnumonks.org CC: linux-mtd@lists.infradead.org CC: linux-bluetooth@vger.kernel.org CC: alsa-devel@alsa-project.org CC: linux-serial@vger.kernel.org CC: Michael Buesch <mb@bu3sch.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: move local definitions out of include/pcmcia/cs.hDominik Brodowski2010-08-033-19/+6
| | | | | | | | | | | | Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: do not use io_req_t when calling pcmcia_request_io()Dominik Brodowski2010-08-0357-584/+527
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of io_req_t, drivers are now requested to fill out struct pcmcia_device *p_dev->resource[0,1] for up to two ioport ranges. After a call to pcmcia_request_io(), the ports found there are reserved, after calling pcmcia_request_configuration(), they may be used. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-ide@vger.kernel.org CC: linux-usb@vger.kernel.org CC: laforge@gnumonks.org CC: linux-mtd@lists.infradead.org CC: alsa-devel@alsa-project.org CC: linux-serial@vger.kernel.org CC: Michael Buesch <mb@bu3sch.de> Acked-by: Marcel Holtmann <marcel@holtmann.org> (for drivers/bluetooth/) Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: do not use io_req_t after call to pcmcia_request_io()Dominik Brodowski2010-08-0351-240/+223
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After pcmcia_request_io(), do not make use of the values stored in io_req_t, but instead use those found in struct pcmcia_device->resource[]. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-ide@vger.kernel.org CC: linux-usb@vger.kernel.org CC: laforge@gnumonks.org CC: linux-mtd@lists.infradead.org CC: alsa-devel@alsa-project.org CC: linux-serial@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> (for drivers/bluetooth/) Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: use struct resource for PCMCIA devicesDominik Brodowski2010-08-035-73/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new field into struct pcmcia_device named "resource" and of type struct resource *, which contains the IO port ranges allocated for this device. Memory window ranges and registration with the resource trees will follow at a later date. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: clean up cs.hDominik Brodowski2010-08-032-12/+3
| | | | | | | | | | | | | | | | | | Remove some obsolete definitions from cs.h Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: use pcmica_{read,write}_config_byteDominik Brodowski2010-08-0312-178/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use pcmcia_read_config_byte and pcmcia_write_config_byte instead of pcmcia_access_configuration_register. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-serial@vger.kernel.org CC: Michael Buesch <mb@bu3sch.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: remove cs_types.hDominik Brodowski2010-07-3090-151/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove cs_types.h which is no longer needed: Most definitions aren't used at all, a few can be made away with, and two remaining definitions (typedefs, unfortunatley) may be moved to more specific places. CC: linux-ide@vger.kernel.org CC: linux-usb@vger.kernel.org CC: laforge@gnumonks.org CC: linux-mtd@lists.infradead.org CC: alsa-devel@alsa-project.org CC: linux-serial@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> (for drivers/bluetooth/) Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: remove unused flag, simplify headersDominik Brodowski2010-07-303-26/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we only provide one way to set up resources now, we can remove the resource-setup-related bitfield (except resource_setup_done). In addition, pcmcia_state only consisted of one entry, so remove this bitfield as well. Suggested-by: Komuro <komurojun-mbn@nifty.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: remove obsolete CS_EVENT_ definitionsDominik Brodowski2010-07-301-45/+0
| | | | | | | | | | | | | | | | | | | | | Remove some definitions which became obsolete when the central event handler got removed. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: split up central event handlerDominik Brodowski2010-07-303-92/+54
| | | | | | | | | | | | | | | | | | | | | Split up the central event handler for 16bit cards into three individual functions. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: simplify event callbackDominik Brodowski2010-07-302-40/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | The event callback for handling 16bit PCMCIA cards only needs to be informed about a few events. Furthermore, send_event may already only be called with skt->skt_mutex held, which also protects against the module being removed behind the callback's back. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | pcmcia: remove obsolete ioctlDominik Brodowski2010-07-3010-1383/+7
| |/ | | | | | | Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
* | Merge branch 'linux-next' of ↵Linus Torvalds2010-08-0639-94/+517
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (30 commits) PCI: update for owner removal from struct device_attribute PCI: Fix warnings when CONFIG_DMI unset PCI: Do not run NVidia quirks related to MSI with MSI disabled x86/PCI: use for_each_pci_dev() PCI: use for_each_pci_dev() PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc() PCI: export SMBIOS provided firmware instance and label to sysfs PCI: Allow read/write access to sysfs I/O port resources x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLAN PCI: remove unused HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_{SIZE|BOUNDARY} PCI: disable mmio during bar sizing PCI: MSI: Remove unsafe and unnecessary hardware access PCI: Default PCIe ASPM control to on and require !EMBEDDED to disable PCI: kernel oops on access to pci proc file while hot-removal PCI: pci-sysfs: remove casts from void* ACPI: Disable ASPM if the platform won't provide _OSC control for PCIe PCI hotplug: make sure child bridges are enabled at hotplug time PCI hotplug: shpchp: Removed check for hotplug of display devices PCI hotplug: pciehp: Fixed return value sign for pciehp_unconfigure_device PCI: Don't enable aspm before drivers have had a chance to veto it ...
| * | PCI: update for owner removal from struct device_attributeStephen Rothwell2010-08-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Fixes the build. Acked-by: Guenter Roeck <guenter.roeck@ericsson.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: Fix warnings when CONFIG_DMI unsetNarendra K2010-08-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the below warnings introduced by the commit 911e1c9b05a8e3559a7aa89083930700a0b9e7ee ("PCI: export SMBIOS provided firmware instance and label to sysfs"). drivers/pci/pci.h: In function ‘pci_create_firmware_label_files’: drivers/pci/pci.h:16: warning: ‘return’ with a value, in function returning void drivers/pci/pci.h: In function ‘pci_remove_firmware_label_files’: drivers/pci/pci.h:18: warning: ‘return’ with a value, in function returning void The warnings are seen because of the below code, doing a retun 0 from the functions 'pci_create_firmware_label_files' and 'pci_remove_firmware_label_files' defined as void. +#ifndef CONFIG_DMI +static inline void pci_create_firmware_label_files(struct pci_dev *pdev) +{ return 0; } +static inline void pci_remove_firmware_label_files(struct pci_dev *pdev) +{ return 0; } Signed-off-by: Narendra K <narendra_k@dell.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: Do not run NVidia quirks related to MSI with MSI disabledRafael J. Wysocki2010-07-301-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no reason to run NVidia-specific quirks related to HT MSI mappings with MSI disabled via pci=nomsi, so make __nv_msi_ht_cap_quirk() return immediately in that case. This allows at least one machine to boot 100% of the time with pci=nomsi (it still doesn't boot reliably without that). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16443 . Cc: stable@kernel.org Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | x86/PCI: use for_each_pci_dev()Kulikov Vasiliy2010-07-301-3/+3
| | | | | | | | | | | | | | | | | | | | | Use for_each_pci_dev() to simplify the code. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: use for_each_pci_dev()Kulikov Vasiliy2010-07-305-7/+6
| | | | | | | | | | | | | | | | | | | | | Use for_each_pci_dev() to simplify the code. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc()Ben Hutchings2010-07-305-8/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove unsafe and unnecessary hardware access" changed read_msi_msg_desc() to return the last MSI message written instead of reading it from the device, since it may be called while the device is in a reduced power state. However, the pSeries platform code really does need to read messages from the device, since they are initially written by firmware. Therefore: - Restore the previous behaviour of read_msi_msg_desc() - Add new functions get_cached_msi_msg{,_desc}() which return the last MSI message written - Use the new functions where appropriate Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: export SMBIOS provided firmware instance and label to sysfsNarendra K2010-07-307-0/+221
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch exports SMBIOS provided firmware instance and label of onboard PCI devices to sysfs. New files are: /sys/bus/pci/devices/.../label which contains the firmware name for the device in question, and /sys/bus/pci/devices/.../index which contains the firmware device type instance for the given device. Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com> Signed-off-by: Narendra K <narendra_k@dell.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: Allow read/write access to sysfs I/O port resourcesAlex Williamson2010-07-302-2/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PCI sysfs resource files currently only allow mmap'ing. On x86 this works fine for memory backed BARs, but doesn't work at all for I/O port backed BARs. Add read/write to I/O port PCI sysfs resource files to allow userspace access to these device regions. Acked-by: Chris Wright <chrisw@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLANBjorn Helgaas2010-07-301-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This DMI quirk turns on "pci=use_crs" for the ALiveSATA2-GLAN because amd_bus.c doesn't handle this system correctly. The system has a single HyperTransport I/O chain, but has two PCI host bridges to buses 00 and 80. amd_bus.c learns the MMIO range associated with buses 00-ff and that this range is routed to the HT chain hosted at node 0, link 0: bus: [00, ff] on node 0 link 0 bus: 00 index 1 [mem 0x80000000-0xfcffffffff] This includes the address space for both bus 00 and bus 80, and amd_bus.c assumes it's all routed to bus 00. We find device 80:01.0, which BIOS left in the middle of that space, but we don't find a bridge from bus 00 to bus 80, so we conclude that 80:01.0 is unreachable from bus 00, and we move it from the original, working, address to something outside the bus 00 aperture, which does not work: pci 0000:80:01.0: reg 10: [mem 0xfebfc000-0xfebfffff 64bit] pci 0000:80:01.0: BAR 0: assigned [mem 0xfd00000000-0xfd00003fff 64bit] The BIOS told us everything we need to know to handle this correctly, so we're better off if we just pay attention, which lets us leave the 80:01.0 device at the original, working, address: ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f]) pci_root PNP0A03:00: host bridge window [mem 0x80000000-0xff37ffff] ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff]) pci_root PNP0A08:00: host bridge window [mem 0xfebfc000-0xfebfffff] This was a regression between 2.6.33 and 2.6.34. In 2.6.33, amd_bus.c was used only when we found multiple HT chains. 3e3da00c01d050, which enabled amd_bus.c even on systems with a single HT chain, caused this failure. This quirk was written by Graham. If we ever enable "pci=use_crs" for machines from 2006 or earlir, this quirk should be removed. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16007 Cc: stable@kernel.org Reported-by: Graham Ramsey <ramsey.graham@ntlworld.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: remove unused HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_{SIZE|BOUNDARY}FUJITA Tomonori2010-07-301-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 2.6.34, we transformed the PCI DMA API into the generic device mode. The PCI DMA API is just the wrapper of the DMA API. So we don't need HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_SIZE or HAVE_ARCH_PCI_SET_DMA_SEGMENT_BOUNDARY (which enable architectures to have the own implementations). Both haven't been used anyway. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: disable mmio during bar sizingJacob Pan2010-07-303-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is a known issue that mmio decoding shall be disabled while doing PCI bar sizing. Host bridge and other devices (PCI PIC) shall be excluded for certain platforms. This patch mainly comes from Mathew Willcox's patch in http://kerneltrap.org/mailarchive/linux-kernel/2007/9/13/258969. A new flag bit "mmio_alway_on" is added to pci_dev with the intention that devices with their mmio decoding cannot be disabled during BAR sizing shall have this bit set, preferrablly in their quirks. Without this patch, Intel Moorestown platform graphics unit will be corrupted during bar sizing activities. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: MSI: Remove unsafe and unnecessary hardware accessBen Hutchings2010-07-301-23/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During suspend on an SMP system, {read,write}_msi_msg_desc() may be called to mask and unmask interrupts on a device that is already in a reduced power state. At this point memory-mapped registers including MSI-X tables are not accessible, and config space may not be fully functional either. While a device is in a reduced power state its interrupts are effectively masked and its MSI(-X) state will be restored when it is brought back to D0. Therefore these functions can simply read and write msi_desc::msg for devices not in D0. Further, read_msi_msg_desc() should only ever be used to update a previously written message, so it can always read msi_desc::msg and never needs to touch the hardware. Tested-by: "Michael Chan" <mchan@broadcom.com> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: Default PCIe ASPM control to on and require !EMBEDDED to disableMatthew Garrett2010-07-301-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CONFIG_PCIEASPM option is confusing and potentially dangerous. ASPM is a hardware mediated feature rather than one under direct OS control, and even if the config option is disabled the system firmware may have turned on ASPM on various bits of hardware. This can cause problems later - various hardware that claims to support ASPM does a poor job of it and may hang or cause other difficulties. The kernel is able to recognise this in many cases and disable the ASPM functionality, but only if CONFIG_PCIEASPM is enabled. Given that in its default configuration this option will either leave the hardware as it was originally or disable hardware functionality that may cause problems, it should by default y. The only reason to disable it ought to be to reduce code size, so make it dependent on CONFIG_EMBEDDED. Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: lrodriguez@atheros.com Cc: maximlevitsky@gmail.com Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: kernel oops on access to pci proc file while hot-removalKenji Kaneshige2010-07-301-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I encountered the problem that /proc/bus/pci/XX/YY is not removed even after the corresponding device is hot-removed, if the file is still being opened. In addtion, accessing this file in this situation causes kernel panic (see below). Becasue the pci_proc_detach_device() doesn't call remove_proc_entry() if struct proc_dir_entry->count > 1, access to /proc/bus/pci/XX/YY would refer to struct pci_dev that was already freed. Though I don't know why the check for proc_dir_entry->count was added, I don't think it is needed. Removing this check fixes the problem. Steps to reproduce ------------------ # cd /sys/bus/pci/slots/2/ # PROC_BUS_PCI_FILE=/proc/bus/pci/`awk -F: '{print $2"/"$3}' < address`.0 # sleep 10000 < $PROC_BUS_PCI_FILE & # echo 0 > power # while true; do cat $PROC_BUS_PCI_FILE > /dev/null; done Oops Messages ------------- BUG: unable to handle kernel NULL pointer dereference at 00000042 IP: [<c05c82d5>] pci_user_read_config_dword+0x65/0xa0 *pdpt = 000000002185e001 *pde = 0000000476a79067 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:10:00.0/local_cpus Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq ipv6 dm_mirror dm_region_hash dm_log dm_mod e1000e i2c_i801 i2c_core iTCO_wdt igb sg pcspkr dca iTCO_vendor_support ext4 mbcache jbd2 sd_mod crc_t10dif lpfc mptsas scsi_transport_fc mptscsih mptbase scsi_tgt scsi_transport_sas [last unloaded: microcode] Pid: 2997, comm: cat Not tainted 2.6.34-kk #32 SB/PRIMEQUEST 1800E EIP: 0060:[<c05c82d5>] EFLAGS: 00010046 CPU: 19 EIP is at pci_user_read_config_dword+0x65/0xa0 EAX: 00000002 EBX: e44f1800 ECX: e144df14 EDX: 155668c7 ESI: 00000087 EDI: 00000000 EBP: e144df40 ESP: e144df0c DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process cat (pid: 2997, ti=e144c000 task=e26f2570 task.ti=e144c000) Stack: c09ceac0 c0570f72 ffffffff 08c57000 00000000 00001000 e44f1800 c05d2404 <0> e144df40 00001000 00000000 00001000 08c57000 3093ae50 e420cb40 e358d5c0 <0> c05d2300 fffffffb c054984f e144df9c 00008000 08c57000 e358d5c0 00008000 Call Trace: [<c0570f72>] ? security_capable+0x22/0x30 [<c05d2404>] ? proc_bus_pci_read+0x104/0x220 [<c05d2300>] ? proc_bus_pci_read+0x0/0x220 [<c054984f>] ? proc_reg_read+0x5f/0x90 [<c05497f0>] ? proc_reg_read+0x0/0x90 [<c050694d>] ? vfs_read+0x9d/0x190 [<c04958f4>] ? audit_syscall_entry+0x204/0x230 [<c0506a81>] ? sys_read+0x41/0x70 [<c0402f1f>] ? sysenter_do_call+0x12/0x28 Code: b4 26 00 00 00 00 b8 20 88 b1 c0 c7 44 24 08 ff ff ff ff e8 3e 52 22 00 f6 83 24 04 00 00 20 75 34 8b 43 08 8d 4c 24 08 8b 53 1c <8b> 70 40 89 4c 24 04 89 f9 c7 04 24 04 00 00 00 ff 16 89 c6 f0 EIP: [<c05c82d5>] pci_user_read_config_dword+0x65/0xa0 SS:ESP 0068:e144df0c CR2: 0000000000000042 Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: pci-sysfs: remove casts from void*Kulikov Vasiliy2010-07-301-1/+1
| | | | | | | | | | | | | | | | | | | | | Remove unnesessary casts from void*. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | ACPI: Disable ASPM if the platform won't provide _OSC control for PCIeMatthew Garrett2010-07-301-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PCI SIG documentation for the _OSC OS/firmware handshaking interface states: "If the _OSC control method is absent from the scope of a host bridge device, then the operating system must not enable or attempt to use any features defined in this section for the hierarchy originated by the host bridge." The obvious interpretation of this is that the OS should not attempt to use PCIe hotplug, PME or AER - however, the specification also notes that an _OSC method is *required* for PCIe hierarchies, and experimental validation with An Alternative OS indicates that it doesn't use any PCIe functionality if the _OSC method is missing. That arguably means we shouldn't be using MSI or extended config space, but right now our problems seem to be limited to vendors being surprised when ASPM gets enabled on machines when other OSs refuse to do so. So, for now, let's just disable ASPM if the _OSC method doesn't exist or refuses to hand over PCIe capability control. Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI hotplug: make sure child bridges are enabled at hotplug timeYinghai Lu2010-07-301-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Found one PCIe Module with several bridges built-in where a "cold" hotadd doesn't work. If we end up reassigning bridge windows at hotadd time, and have to loop through assigning new ranges, we won't end up enabling the child bridges because the first assignment pass already tried to enable them, which prevents __pci_bridge_assign_resource from updating the windows. So try to move enabling of child bridges to the end, and only do it once. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI hotplug: shpchp: Removed check for hotplug of display devicesPraveen Kalamegham2010-07-301-15/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removed check to prevent hotplug of display devices within shpchp. Originally this was thought to have been required within the PCI Hotplug specification for some legacy devices. However there is no such requirement in the most recent revision. The check prevents hotplug of not only display devices but also computational GPUs which require serviceability. Signed-off-by: Praveen Kalamegham <praveen@nextio.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI hotplug: pciehp: Fixed return value sign for pciehp_unconfigure_devicePraveen Kalamegham2010-07-301-1/+1
| | | | | | | | | | | | | | | | | | | | | pciehp_unconfigure_device() should return -EINVAL, not EINVAL. Signed-off-by: Praveen Kalamegham <praveen@nextio.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: Don't enable aspm before drivers have had a chance to veto itMatthew Garrett2010-07-301-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The aspm code will currently set the configured aspm policy before drivers have had an opportunity to indicate that their hardware doesn't support it. Unfortunately, putting some hardware in L0 or L1 can result in the hardware no longer responding to any requests, even after aspm is disabled. It makes more sense to leave aspm policy at the BIOS defaults at initial setup time, reconfiguring it after pci_enable_device() is called. This allows the driver to blacklist individual devices beforehand. Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: fix wrong memory address handling in MSI-XKenji Kaneshige2010-07-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Use resource_size_t for MMIO address instead of unsigned long. Otherwise, higher 32-bits of MMIO address are cleared unexpectedly in x86-32 PAE. Acked-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
OpenPOWER on IntegriCloud