summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* IB/iser: Nit - add space after __func__ in iser loggingSagi Grimberg2014-10-091-8/+8
| | | | | | | | Change logging: "iser:XXXX" to "iser: XXXX" Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Change iscsi_conn_stop log level to infoAriel Nahum2014-10-091-1/+1
| | | | | | | | Match to the debug level of all functions in connect/disconnect flows. Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Suppress scsi command send completionsSagi Grimberg2014-10-093-7/+16
| | | | | | | | | | | | | | Singal completion of every 32 scsi commands and suppress all the rest. We don't do anything upon getting the completion so no need to "just consume" it. Cleanup of scsi command is done in cleanup_task callback. Still keep dataout and control send completions as we may need to cleanup there. This helps reducing the amount of interrupts/completions in the IO path. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Optimize completion pollingSagi Grimberg2014-10-092-5/+11
| | | | | | | | | Poll in batch of 16. Since we don't want it on the stack, keep under iser completion context (iser_comp). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Use beacon to indicate all completions were consumedSagi Grimberg2014-10-093-22/+23
| | | | | | | | | | Avoid post_send counting (atomic) in the IO path just to keep track of how many completions we need to consume. Use a beacon post to indicate that all prior posts completed. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Use single CQ for RX and TXSagi Grimberg2014-10-093-125/+114
| | | | | | | | | | | | | | | This will solve a possible condition where we might miss TX completion (flush error) during session teardown. Since we are using a single CQ, we don't need to actively drain the TX CQ, instead just wait for flush_completion (when counters reach zero) and remove iser_poll_for_flush_errors(). This patch might introduce a minor performance regression on its own, but the next patches will enhance performance using a single CQ for RX and TX. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Use internal polling budget to avoid possible live-lockSagi Grimberg2014-10-091-0/+4
| | | | | | | | | | | | We need a way to guarentee that we don't stay in soft-IRQ context for too long. We might starve other pending CQ tasklets or worse lock against application trying to issue IO on the running CPU. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Centralize iser completion contextsSagi Grimberg2014-10-092-87/+84
| | | | | | | | | Introduce iser_comp which centralizes all iser completion related items and is referenced by iser_device and each ib_conn. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Use iser_warn instead of BUG_ON in iser_conn_releaseAriel Nahum2014-10-091-1/+3
| | | | | | | | | | | | | | | In case iscsid was violently killed (SIGKILL) during its error recovery stage, we may never get a connection teardown sequence for some of the old connections. No harm done, but when we try to unload the module we will need to cleanup all these connections. So we actually may end-up here - so it's not a BUG_ON(), just give a relaxed warning that this happened and continue with normal unload. BUG_ON() will cause segfault on module_exit and we don't want that. Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Signal iSCSI layer that transport is broken in error completionsSagi Grimberg2014-10-091-4/+25
| | | | | | | | | | | | | | | | | Previously we notified iscsi layer about the connection layer when we consumed all of our flush errors. This was racy as there was no guarentee that iscsi_conn wasn't terminated by then (which ends up in an invalid memory access). In case we got a non FLUSH error completion, we are guarenteed that iscsi_conn is still alive. We should notify iSCSI layer with iscsi_conn_failure to initiate error handling. While we are at it, add a nice kernel-doc style documentation. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Protect tasks cleanup in case IB device was already releasedSagi Grimberg2014-10-091-1/+7
| | | | | | | | | | | | Bailout in case a task cleanup (iscsi_iser_cleanup_task) is called after the IB device was removed (DEVICE_REMOVAL CM event). We also call iscsi_conn_stop with a lock taken to prevent DEVICE_REMOVAL and tasks cleanup from racing. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Unbind at conn_stop stageAriel Nahum2014-10-091-0/+7
| | | | | | | | | | | | | | | | | | | | Previously we didn't need to unbind the iser_conn and iscsi_conn since we always relied on iscsi daemon to teardown the connection and never let it finish before we cleanup all that is needed in iser. This is not the case anymore (for DEVICE_REMOVAL event). So avoid any possible chance we cause iscsi_conn dereference after iscsi_conn was freed. We also call iser_conn_terminate (safe to call multiple times) just for the corner case of iscsi daemon stopping an old connection before invoking endpoint removal (might happen if it was violently killed). Notice we are unbinding under a lock - which is required. Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Don't bound release_work completions timeoutsSagi Grimberg2014-10-091-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | We no longer rely on iscsi connection teardown sequence, so no need to give a grace period and continue cleanup if it expired. Have iser_conn_release wait for full completion before freeing iser_conn. ib_completion: Guaranteed to come when: - Got DISCONNECTED/ADDR_CHANGE event or - iSCSI called ep_disconnect/conn_stop Guaranteed to finish when: - Got TIMEWAIT_EXIT/DEVICE_REMOVAL event - All Flush errors are consumed - IB related resources are destroyed stop_completion: Guaranteed to come when: - iSCSI calls conn_stop Guaranteed to finish when: - All inflight tasks were cleaned up Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Fix DEVICE REMOVAL handling in the absence of iscsi daemonSagi Grimberg2014-10-092-61/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iscsi daemon is in user-space, thus we can't rely on it to be invoked at connection teardown (if not running or does not receive CPU time). This patch addresses the issue by re-structuring iSER connection teardown logic and CM events handling. The CM events will dictate the RDMA resources destruction (ib_conn) and iser_conn is kept around as long as iscsi_conn is left around allowing iscsi/iser callbacks to continue after RDMA transport was destroyed. This patch introduces a separation in logic when handling CM events: - DISCONNECTED_HANDLER, ADDR_CHANGED This events indicate the start of teardown process. Actions: 1. Terminate the connection: rdma_disconnect (send DREQ/DREP) 2. Notify iSCSI of connection failure 3. Change state to TERMINATING 4. Poll for all flush errors to be consumed - TIMEWAIT_EXIT, DEVICE_REMOVAL These events indicate the final stage of termination process and we can free RDMA related resources. Actions: 1. Call disconnected handler (we are not guaranteed that DISCONNECTED event was invoked in the past) 2. Cleanup RDMA related resources 3. For DEVICE_REMOVAL return non-zero rc from cma_handler to implicitly destroy the cm_id (Can't rely on user-space, make sure we have forward progress) We replace flush_completion (indicate all flushes were consumed) with ib_completion (rdma resources were cleaned up). The iser_conn_release_work will wait for teardown completions: - conn_stop was completed (tasks were cleaned-up) - stop_completion - RDMA resources were destroyed - ib_completion And then will continue to free iser connection representation (iser_conn). Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Extend iser_free_ib_conn_res()Sagi Grimberg2014-10-091-28/+30
| | | | | | | | | | | | Put all connection IB related resources release in this routine. One exception is the cm_id which cannot be destroyed as the routine is protected by the state mutex. Also move its position to avoid forward declaration. While at it fix qp NULL assignment. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Remove unused variables and dead codeRoi Dayan2014-10-091-4/+0
| | | | | | | Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Re-introduce ib_connSagi Grimberg2014-10-095-204/+245
| | | | | | | | | | | Structure that describes the RDMA relates connection objects. Static member of iser_conn. This patch does not change any functionality Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/iser: Rename ib_conn -> iser_connSagi Grimberg2014-10-095-392/+403
| | | | | | | | | | | | | | | | Two reasons why we choose to do this: 1. No point today calling struct iser_conn by another name ib_conn 2. In the next patches we will restructure iser control plane representation - struct iser_conn: connection logical representation - struct ib_conn: connection RDMA layout representation This patch does not change any functionality. Signed-off-by: Ariel Nahum <arieln@mellanox.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* Linux 3.17-rc7v3.17-rc7Linus Torvalds2014-09-281-1/+1
|
* Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds2014-09-281-0/+5
|\ | | | | | | | | | | | | | | | | | | Pull slave-dmaengine fixes from Vinod Koul: "Two small fixes for omap dmaengine driver which fixes cyclic suspend and resume" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: omap-dma: Restore the CLINK_CTRL in resume path dmaengine: omap-dma: Add memory barrier to dma_resume path
| * dmaengine: omap-dma: Restore the CLINK_CTRL in resume pathPeter Ujfalusi2014-09-231-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | When the audio stream is paused or suspended we stop the sDMA and when it is unpaused/resumed we start the channel without reconfiguring it. The omap_dma_stop() clears the link configuration when we pause the dma, but it is not setting it back on start. This will result only one audio buffer to be played back and the DMA will stop, since the linking is disabled. We need to restore the CLINK_CTRL register in case of resume. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
| * dmaengine: omap-dma: Add memory barrier to dma_resume pathPeter Ujfalusi2014-09-231-0/+2
| | | | | | | | | | | | | | | | | | Add mb() call to resume path to ensure the necessary barrier. Resume can happen after waking up from suspend for example. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2014-09-278-85/+60
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Assorted fixes + unifying __d_move() and __d_materialise_dentry() + minimal regression fix for d_path() of victims of overwriting rename() ported on top of that" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Don't exchange "short" filenames unconditionally. fold swapping ->d_name.hash into switch_names() fold unlocking the children into dentry_unlock_parents_for_move() kill __d_materialise_dentry() __d_materialise_dentry(): flip the order of arguments __d_move(): fold manipulations with ->d_child/->d_subdirs don't open-code d_rehash() in d_materialise_unique() pull rehashing and unlocking the target dentry into __d_materialise_dentry() ufs: deal with nfsd/iget races fuse: honour max_read and max_write in direct_io mode shmem: fix nlink for rename overwrite directory
| * | vfs: Don't exchange "short" filenames unconditionally.Mikhail Efremov2014-09-271-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only exchange source and destination filenames if flags contain RENAME_EXCHANGE. In case if executable file was running and replaced by other file /proc/PID/exe should still show correct file name, not the old name of the file by which it was replaced. The scenario when this bug manifests itself was like this: * ALT Linux uses rpm and start-stop-daemon; * during a package upgrade rpm creates a temporary file for an executable to rename it upon successful unpacking; * start-stop-daemon is run subsequently and it obtains the (nonexistant) temporary filename via /proc/PID/exe thus failing to identify the running process. Note that "long" filenames (> DNAiME_INLINE_LEN) are still exchanged without RENAME_EXCHANGE and this behaviour exists long enough (should be fixed too apparently). So this patch is just an interim workaround that restores behavior for "short" names as it was before changes introduced by commit da1ce0670c14 ("vfs: add cross-rename"). See https://lkml.org/lkml/2014/9/7/6 for details. AV: the comments about being more careful with ->d_name.hash than with ->d_name.name are from back in 2.3.40s; they became obsolete by 2.3.60s, when we started to unhash the target instead of swapping hash chain positions followed by d_delete() as we used to do when dcache was first introduced. Acked-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: da1ce0670c14 "vfs: add cross-rename" Signed-off-by: Mikhail Efremov <sem@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | fold swapping ->d_name.hash into switch_names()Linus Torvalds2014-09-271-2/+1
| | | | | | | | | | | | | | | | | | | | | and do it along with ->d_name.len there Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | fold unlocking the children into dentry_unlock_parents_for_move()Al Viro2014-09-261-5/+4
| | | | | | | | | | | | | | | | | | | | | ... renaming it into dentry_unlock_for_move() and making it more symmetric with dentry_lock_for_move(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | kill __d_materialise_dentry()Al Viro2014-09-261-44/+10
| | | | | | | | | | | | | | | | | | it folds into __d_move() now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | __d_materialise_dentry(): flip the order of argumentsAl Viro2014-09-261-24/+20
| | | | | | | | | | | | | | | | | | | | | ... thus making it much closer to (now unreachable, BTW) IS_ROOT(dentry) case in __d_move(). A bit more and it'll fold in. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | __d_move(): fold manipulations with ->d_child/->d_subdirsAl Viro2014-09-261-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | list_del() + list_add() is a slightly pessimised list_move() list_del() + INIT_LIST_HEAD() is a slightly pessimised list_del_init() Interleaving those makes the resulting code even worse. And harder to follow... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | don't open-code d_rehash() in d_materialise_unique()Al Viro2014-09-261-5/+1
| | | | | | | | | | | | | | | | | | ... and get rid of duplicate BUG_ON() there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | pull rehashing and unlocking the target dentry into __d_materialise_dentry()Al Viro2014-09-261-7/+4
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | ufs: deal with nfsd/iget racesAl Viro2014-09-262-1/+9
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | fuse: honour max_read and max_write in direct_io modeMiklos Szeredi2014-09-264-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The third argument of fuse_get_user_pages() "nbytesp" refers to the number of bytes a caller asked to pack into fuse request. This value may be lesser than capacity of fuse request or iov_iter. So fuse_get_user_pages() must ensure that *nbytesp won't grow. Now, when helper iov_iter_get_pages() performs all hard work of extracting pages from iov_iter, it can be done by passing properly calculated "maxsize" to the helper. The other caller of iov_iter_get_pages() (dio_refill_pages()) doesn't need this capability, so pass LONG_MAX as the maxsize argument here. Fixes: c9c37e2e6378 ("fuse: switch to iov_iter_get_pages()") Reported-by: Werner Baumann <werner.baumann@onlinehome.de> Tested-by: Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | shmem: fix nlink for rename overwrite directoryMiklos Szeredi2014-09-261-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If overwriting an empty directory with rename, then need to drop the extra nlink. Test prog: #include <stdio.h> #include <fcntl.h> #include <err.h> #include <sys/stat.h> int main(void) { const char *test_dir1 = "test-dir1"; const char *test_dir2 = "test-dir2"; int res; int fd; struct stat statbuf; res = mkdir(test_dir1, 0777); if (res == -1) err(1, "mkdir(\"%s\")", test_dir1); res = mkdir(test_dir2, 0777); if (res == -1) err(1, "mkdir(\"%s\")", test_dir2); fd = open(test_dir2, O_RDONLY); if (fd == -1) err(1, "open(\"%s\")", test_dir2); res = rename(test_dir1, test_dir2); if (res == -1) err(1, "rename(\"%s\", \"%s\")", test_dir1, test_dir2); res = fstat(fd, &statbuf); if (res == -1) err(1, "fstat(%i)", fd); if (statbuf.st_nlink != 0) { fprintf(stderr, "nlink is %lu, should be 0\n", statbuf.st_nlink); return 1; } return 0; } Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge branch 'for-3.17-fixes' of ↵Linus Torvalds2014-09-276-24/+43
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "This is quite late but these need to be backported anyway. This is the fix for a long-standing cpuset bug which existed from 2009. cpuset makes use of PF_SPREAD_{PAGE|SLAB} flags to modify the task's memory allocation behavior according to the settings of the cpuset it belongs to; unfortunately, when those flags have to be changed, cpuset did so directly even whlie the target task is running, which is obviously racy as task->flags may be modified by the task itself at any time. This obscure bug manifested as corrupt PF_USED_MATH flag leading to a weird crash. The bug is fixed by moving the flag to task->atomic_flags. The first two are prepatory ones to help defining atomic_flags accessors and the third one is the actual fix" * 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags sched: add macros to define bitops for task atomic flags sched: fix confusing PFA_NO_NEW_PRIVS constant
| * | | cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flagsZefan Li2014-09-245-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we change cpuset.memory_spread_{page,slab}, cpuset will flip PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset. This should be done using atomic bitops, but currently we don't, which is broken. Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened when one thread tried to clear PF_USED_MATH while at the same time another thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on the same task. Here's the full report: https://lkml.org/lkml/2014/9/19/230 To fix this, we make PF_SPREAD_PAGE and PF_SPREAD_SLAB atomic flags. v4: - updated mm/slab.c. (Fengguang Wu) - updated Documentation. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: Kees Cook <keescook@chromium.org> Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time") Cc: <stable@vger.kernel.org> # 2.6.31+ Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Zefan Li <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | sched: add macros to define bitops for task atomic flagsZefan Li2014-09-242-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will simplify code when we add new flags. v3: - Kees pointed out that no_new_privs should never be cleared, so we shouldn't define task_clear_no_new_privs(). we define 3 macros instead of a single one. v2: - updated scripts/tags.sh, suggested by Peter Cc: Ingo Molnar <mingo@kernel.org> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Zefan Li <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | sched: fix confusing PFA_NO_NEW_PRIVS constantZefan Li2014-09-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1d4457f99928 ("sched: move no_new_privs into new atomic flags") defined PFA_NO_NEW_PRIVS as hexadecimal value, but it is confusing because it is used as bit number. Redefine it as decimal bit number. Note this changes the bit position of PFA_NOW_NEW_PRIVS from 1 to 0. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Kees Cook <keescook@chromium.org> [ lizf: slightly modified subject and changelog ] Signed-off-by: Zefan Li <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | | | Merge tag 'fixes-for-linus' of ↵Linus Torvalds2014-09-2713-77/+139
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Here's our last set of fixes for 3.17. Most of these are for TI platforms, fixing some noisy Kconfig issues, runtime clock and power issues on several platforms and NAND timings on DRA7. There are also a couple of bug fixes for i.MX, one for QCOM and a small fix to avoid section mismatch noise on PXA. Diffstat looks large, partially due to some tables being updated and thus touching many lines. The qcom gsbi change also restructures clock management a bit and thus touches a bunch of lines. All in all, a bit more changes than we'd like at this point, but nothing stands out as risky either so it seems like the right thing to send it up now instead of holding it to the merge window" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: drivers/soc: qcom: do not disable the iface clock in probe ARM: imx: fix .is_enabled() of shared gate clock ARM: OMAP3: Fix I/O chain clock line assertion timed out error ARM: keystone: dts: fix bindings for pcie and usb clock nodes bus: omap_l3_noc: Fix connID for OMAP4 ARM: DT: imx53: fix lvds channel 1 port ARM: dts: cm-t54: fix serial console power supply. ARM: dts: dra7-evm: Fix NAND GPMC timings ARM: pxa: fix section mismatch warning for pxa_timer_nodt_init ARM: OMAP: Fix Kconfig warning for omap1
| * | | | drivers/soc: qcom: do not disable the iface clock in probeSrinivas Kandagatla2014-09-231-13/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | since commit 31964ffebbb9 ("tty: serial: msm: Remove direct access to GSBI")' serial hangs if earlyprintk are enabled. This hang is noticed only when the GSBI driver is probed and all the earlyprintks before gsbi probe are seen on the console. The reason why it hangs is because GSBI driver disables hclk in its probe function without realizing that the serial IP might be in use by a bootconsole. As gsbi driver disables the clock in probe the bootconsole locks up. Turning off hclk's could be dangerous if there are system components like earlyprintk using the hclk. This patch fixes the issue by delegating the clock management to probe and remove functions in gsbi rather than disabling the clock in probe. More detailed problem description can be found here: http://www.spinics.net/lists/linux-arm-msm/msg10589.html Tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
| * | | | Merge tag 'fix-v3.17-io-chain-v3' of ↵Olof Johansson2014-09-222-5/+36
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Regression fix for early omap3 revisions for wake-up events that too some time to narrow down. Although a bit intrusive, this would be good to get into the -rc cycle as there are quite a few boards out there with omap3 es2.1 and es3.0, and we have those in at least three boot test systems too that show errors without this patch. * tag 'fix-v3.17-io-chain-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP3: Fix I/O chain clock line assertion timed out error Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | | | ARM: OMAP3: Fix I/O chain clock line assertion timed out errorTony Lindgren2014-09-162-5/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are getting "PRM: I/O chain clock line assertion timed out" errors on early omaps for device tree based booting. This is because we are unconditionally calling reconfigure_io_chain while legacy booting has omap3_has_io_chain_ctrl() checks in place in omap_hwmod.c. For device tree based booting, we are calling reconfigure_io_chain unconditionally from pinctrl framework. So we need to add a check for omap3_has_io_chain_ctrl() to avoid the errors for trying to access a register that does not exist. For es3.0, the documentation in "4.11.2 Device Off-Mode Configuration" just mentions PM_WKEN_WKUP[8] bit. For es3.1, there's a new chapter in documentation for "4.11.2.2 I/O Wake-Up Mechanism" that describes the PM_WKEN_WKUP[16] ST_IO_CHAIN bit. So PM_WKEN_WKUP[16] bit did not get added until in es3.1 probaly to fix issues with flakey wake-up events. We are doing proper checks for ST_IO_CHAIN already in id.c and with omap3_has_io_chain_ctrl(). For more information, see also commit b02b917211d5 ("ARM: OMAP3: PM: fix I/O wakeup and I/O chain clock control detection"). Let's fix the issue by selecting the right function during init for reconfigure_io_chain depending on the omap revision. For es3.0 and earlier we need to just toggle EN_IO. By doing this, we can move the check for omap3_has_io_chain_ctrl() from omap_hwmod.c to the init code in prm_3xxx.c. And then we can unconditionally call reconfigure_io_chain. Thanks to Paul Walmsley and Nishanth Menon for help with debugging the issue. Fixes: 30a69ef785e8 ("ARM: OMAP: Move DT wake-up event handling over to use pinctrl-single-omap") Cc: Kevin Hilman <khilman@kernel.org> Cc: Paul Walmsley <paul@pwsan.com> Cc: Tero Kristo <t-kristo@ti.com> Reviewed-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
| * | | | | Merge tag 'fixes-v3.17-rc4' of ↵Olof Johansson2014-09-223-43/+39
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Few regression fixes for omaps for the -rc cycle: - Fix for omap_l3_noc bus code - Serial console fix for cm-t53 - NAND timings fix for dra7-evm * tag 'fixes-v3.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: bus: omap_l3_noc: Fix connID for OMAP4 ARM: dts: cm-t54: fix serial console power supply. ARM: dts: dra7-evm: Fix NAND GPMC timings Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | | | | bus: omap_l3_noc: Fix connID for OMAP4Nishanth Menon2014-09-111-25/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d4d8819e205854c ("bus: omap_l3_noc: fix masterid detection") did the right thing in dropping the LSB 2 bits which is not part of the ConnID for NTTP master address. However, as part of that change, we should also have ensured that existing list of OMAP4 connID codes are also shifted by 2 bits to ensure that connIDs map to "Table 13-18. ConnID Values" as provided in Technical Reference Manuals for OMAP4430(Rev AP, April 2014, SWPU220AP) and OMAP4460(Rev AB, April 2014, SWPU234AB) Fixes: d4d8819e205854c ("bus: omap_l3_noc: fix masterid detection") Reported-by: Kristian Otnes <kotnes@cisco.com> Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
| | * | | | | ARM: dts: cm-t54: fix serial console power supply.Dmitry Lifshitz2014-09-101-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LDO8 regulator is used for act led and serial cosole power supply. Its DT status is declared as "disabled", however the serial console was functional until Commit 318dbb02b ("regulator: palmas: Fix SMPS enable/disable/is_enabled") wich properly turns off LDO8 on boot. Fix serial cosole power supply (and act led) on boot by turning LDO8 on. Fixes: 318dbb02b ("regulator: palmas: Fix SMPS enable/disable/is_enabled") Signed-off-by: Dmitry Lifshitz <lifshitz@compulab.co.il> Tested-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
| | * | | | | ARM: dts: dra7-evm: Fix NAND GPMC timingsRoger Quadros2014-09-101-15/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nand timings were scaled down by 2 to account for the 2x rate returned by clk_get_rate(gpmc_fclk). As the clock data got fixed by [1], revert back to actual timings (i.e. scale them up by 2). Without this NAND doesn't work on dra7-evm. [1] - commit dd94324b983afe114ba9e7ee3649313b451f63ce ARM: dts: dra7xx-clocks: Fix the l3 and l4 clock rates Fixes: ff66a3c86e00 ("ARM: dts: dra7: add support for parallel NAND flash") Cc: <stable@vger.kernel.org> [3.16] Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
| * | | | | | Merge tag 'fixes-v3.17-rc4' of ↵Olof Johansson2014-09-221-3/+3
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into fixes Keystone Edision dts fix for -rc cycle. Fix the PCIE and USB nodes. * tag 'fixes-v3.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone: ARM: keystone: dts: fix bindings for pcie and usb clock nodes Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | | | | | ARM: keystone: dts: fix bindings for pcie and usb clock nodesMurali Karicheri2014-09-111-3/+3
| | |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix incorrect clock names for usb1, pcie1 and domain register offset for pcie1 clock nodes on K2E EVM Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
| * | | | | | ARM: imx: fix .is_enabled() of shared gate clockShawn Guo2014-09-221-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 63288b721a80 ("ARM: imx: fix shared gate clock") attempted to fix an issue with particular enable/disable sequence from two shared gate clocks. But unfortunately, while it partially fixed the issue, it also did something wrong in .is_enabled() function hook. In case of shared gate, the function shouldn't really query the hardware state via share_count, because the function is trying to query the enabling state of the clock in question, not the hardware state which is shared by multiple clocks. Fix the issue by returning the enable_count of the clock itself which is maintained by clock core, in case it's a clock sharing hardware gate with others. As the result, the initialization of share_count per hardware state is not needed now. So remove it. Reported-by: Fabio Estevam <fabio.estevam@freescale.com> Fixes: 63288b721a80 ("ARM: imx: fix shared gate clock") Cc: <stable@vger.kernel.org> Signed-off-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Olof Johansson <olof@lixom.net>
| * | | | | | ARM: DT: imx53: fix lvds channel 1 portMarkus Niebel2014-09-112-4/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | using LVDS channel 1 on an i.MX53 leads to following error: imx-ldb 53fa8008.ldb: unable to set di0 parent clock to ldb_di1 This comes from imx_ldb_set_clock with mux = 0. Mux parameter must be "1" for reparenting di1 clock to ldb_di1. The value of the mux param comes from device tree port settings. On i.MX5, the internal two-input-multiplexer is used. Due to hardware limitations, only one port (port@[0,1]) can be used for each channel (lvds-channel@[0,1], respectively) Documentation update suggested by Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Markus Niebel <Markus.Niebel@tq-group.com> Fixes: e05c8c9a790a ("ARM: dts: imx53: Add IPU DI ports and endpoints, move imx-drm node to dtsi") Cc: <stable@vger.kernel.org> Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Shawn Guo <shawn.guo@freescale.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
OpenPOWER on IntegriCloud