summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* nfsd41: hard page limit for DRCAndy Adamson2009-04-032-0/+5
| | | | | | | | | | | | | | | | Use no more than 1/128th of the number of free pages at nfsd startup for the v4.1 DRC. This is an arbitrary default which should probably end up under the control of an administrator. Signed-off-by: Andy Adamson <andros@netapp.com> [moved added fields in struct svc_serv under CONFIG_NFSD_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [fix set_max_drc calculation of sv_drc_max_pages] [moved NFSD_DRC_SIZE_SHIFT's declaration up in header file] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: DRC save, restore, and clear functionsAndy Adamson2009-04-033-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cache all the result pages, including the rpc header in rq_respages[0], for a request in the slot table cache entry. Cache the statp pointer from nfsd_dispatch which points into rq_respages[0] just past the rpc header. When setting a cache entry, calculate and save the length of the nfs data minus the rpc header for rq_respages[0]. When replaying a cache entry, replace the cached rpc header with the replayed request rpc result header, unless there is not enough room in the cached results first page. In that case, use the cached rpc header. The sessions fore channel maxresponse size cached is set to NFSD_PAGES_PER_SLOT * PAGE_SIZE. For compounds we are cacheing with operations such as READDIR that use the xdr_buf->pages to hold data, we choose to cache the extra page of data rather than copying data from xdr_buf->pages into the xdr_buf->head page. [nfsd41: limit cache to maxresponsesize_cached] [nfsd41: mv nfsd4_set_statp under CONFIG_NFSD_V4_1] [nfsd41: rename nfsd4_move_pages] [nfsd41: rename page_no variable] [nfsd41: rename nfsd4_set_cache_entry] [nfsd41: fix nfsd41_copy_replay_data comment] [nfsd41: add to nfsd4_set_cache_entry] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: sequence operationBenny Halevy2009-04-031-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement the sequence operation conforming to http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26 Check for stale clientid (as derived from the sessionid). Enforce slotid range and exactly-once semantics using the slotid and seqid. If everything went well renew the client lease and mark the slot INPROGRESS. Add a struct nfsd4_slot pointer to struct nfsd4_compound_state. To be used for sessions DRC replay. [nfsd41: rename sequence catchthis to cachethis] Signed-off-by: Andy Adamson<andros@netapp.com> [pulled some code to set cstate->slot from "nfsd DRC logic"] [use sessionid_lock spin lock] [nfsd41: use bool inuse for slot state] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd: add a struct nfsd4_slot pointer to struct nfsd4_compound_state] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: add nfsd4_session pointer to nfsd4_compound_state] [nfsd41: set cstate session] [nfsd41: use cstate session in nfsd4_sequence] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [simplify nfsd4_encode_sequence error handling] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: match clientid establishment methodAndy Adamson2009-04-031-1/+1
| | | | | | | | | | | | | | We need to distinguish between client names provided by NFSv4.0 clients SETCLIENTID and those provided by NFSv4.1 via EXCHANGE_ID when looking up the clientid by string. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamson <andros@netapp.com> [nfsd41: use boolean values for use_exchange_id argument] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: simplify match_clientid_establishment logic] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: exchange_id operationAndy Adamson2009-04-032-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement the exchange_id operation confoming to http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-28 Based on the client provided name, hash a client id. If a confirmed one is found, compare the op's creds and verifier. If the creds match and the verifier is different then expire the old client (client re-incarnated), otherwise, if both match, assume it's a replay and ignore it. If an unconfirmed client is found, then copy the new creds and verifer if need update, otherwise assume replay. The client is moved to a confirmed state on create_session. In the nfs41 branch set the exchange_id flags to EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_SUPP_MOVED_REFER (pNFS is not supported, Referrals are supported, Migration is not.). Address various scenarios from section 18.35 of the spec: 1. Check for EXCHGID4_FLAG_UPD_CONFIRMED_REC_A and set EXCHGID4_FLAG_CONFIRMED_R as appropriate. 2. Return error codes per 18.35.4 scenarios. 3. Update client records or generate new client ids depending on scenario. Note: 18.35.4 case 3 probably still needs revisiting. The handling seems not quite right. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamosn <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use utsname for major_id (and copy to server_scope)] [nfsd41: fix handling of various exchange id scenarios] Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: reverse use of EXCHGID4_INVAL_FLAG_MASK_A] [simplify nfsd4_encode_exchange_id error handling] [nfsd41: embed an xdr_netobj in nfsd4_exchange_id] [nfsd41: return nfserr_serverfault for spa_how == SP4_MACH_CRED] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: proc stubsAndy Adamson2009-04-031-0/+12
| | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: xdr infrastructureAndy Adamson2009-04-031-0/+22
| | | | | | | | | | | | | | | | | Define nfsd41_dec_ops vector and add it to nfsd4_minorversion for minorversion 1. Note: nfsd4_enc_ops vector is shared for v4.0 and v4.1 since we don't need to filter out obsolete ops as this is done in the decoding phase. exchange_id, create_session, destroy_session, and sequence ops are implemented as stubs returning nfserr_opnotsupp at this stage. [was nfsd41: xdr stubs] [get rid of CONFIG_NFSD_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: sessionid hashingMarc Eshel2009-04-031-0/+7
| | | | | | | | | | | | | | | | | | Simple sessionid hashing using its monotonically increasing sequence number. Locking considerations: sessionid_hashtbl access is controlled by the sessionid_lock spin lock. It must be taken for insert, delete, and lookup. nfsd4_sequence looks up the session id and if the session is found, it calls nfsd4_get_session (still under the sessionid_lock). nfsd4_destroy_session calls nfsd4_put_session after unhashing it, so when the session's kref reaches zero it's going to get freed. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [we don't use a prime for sessionid hash table size] [use sessionid_lock spin lock] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: introduce nfs4_client cl_sessions listMarc Eshel2009-04-031-0/+3
| | | | | | [get rid of CONFIG_NFSD_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: sessions basic data typesAndy Adamson2009-04-031-0/+33
| | | | | | | | | | | | | | | | | | | | | This patch provides basic data structures representing the nfs41 sessions and slots, plus helpers for keeping a reference count on the session and freeing it. Note that our server only support a headerpadsz of 0 and it ignores backchannel attributes at the moment. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: remove headerpadsz from channel attributes] [nfsd41: embed nfsd4_channel in nfsd4_session] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use bool inuse for slot state] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41 remove sl_session from nfsd4_slot] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: define nfs41 error codesMarc Eshel2009-04-032-2/+42
| | | | | | | | | | | Define all error code present in http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: clean up error code definitions] [nfsd41: change NFSERR_REPLAY_ME] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfs41: common protocol definitionsBenny Halevy2009-04-031-1/+127
| | | | | | | | | | | | | | | | | | | | | | | Define all NFSv4.1 common operation and error code constants. Note that some of the definitions are used by both the nfs41 client and the server code. This patch is duplicated in the nfs41 and nfsd41 sessions patchset. Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: add exchange id flags] Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [removed server-only hunk changing NFSERR_REPLAY_ME] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: add SEQ4_XX to nfs41-common-protocol] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: generic error code update] [nfs41: reverse EXCHGID4_INVAL_FLAG_MASK_{A,R}] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: don't use the deferral service, return NFS4ERR_DELAYAndy Adamson2009-04-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be returned. It is up to the NFSv4.1 client to resend only the operations that have not been processed. Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in nfsd4_proc_compound() only when NFSv4.1 Sessions are used. Note: this isn't an adequate solution on its own. It's acceptable as a way to get some minimal 4.1 up and working, but we're going to have to find a way to avoid returning DELAY in all common cases before 4.1 can really be considered ready. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: reverse rq_nodeferral negative logic] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [sunrpc: initialize rq_usedeferral] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: embed nfsd4_current_state in nfsd4_compoundresAndy Adamson2009-03-291-4/+5
| | | | | | | | Remove the allocation of struct nfsd4_compound_state. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Fix a build warning about leaking CONFIG_NFSD to userspace.Greg Banks2009-03-271-4/+5
| | | | | | | | | | | Fix a build warning about leaking CONFIG_NFSD to userspace. The nfsd_stats data structure does not need to be available to userspace; no kernel interface uses it. So move it inside #ifdef __KERNEL__ and the warning goes away. Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* SUNRPC: Clean up static inline functions in svc_xprt.hChuck Lever2009-03-181-20/+26
| | | | | | | | Clean up: Enable the use of const arguments in higher level svc_ APIs by adding const to the arguments of the helper functions in svc_xprt.h Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* knfsd: add file to export stats about nfsd poolsGreg Banks2009-03-181-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Add /proc/fs/nfsd/pool_stats to export to userspace various statistics about the operation of rpc server thread pools. This patch is based on a forward-ported version of knfsd-add-pool-thread-stats which has been shipping in the SGI "Enhanced NFS" product since 2006 and which was previously posted: http://article.gmane.org/gmane.linux.nfs/10375 It has also been updated thus: * moved EXPORT_SYMBOL() to near the function it exports * made the new struct struct seq_operations const * used SEQ_START_TOKEN instead of ((void *)1) * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of printk in svc_pool_stats_*()" by Harshula Jayasuriya. * merged fix from SGI PV 964001 "Crash reading pool_stats before nfsds are started". Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: Harshula Jayasuriya <harshula@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* knfsd: avoid overloading the CPU scheduler with enormous load averagesGreg Banks2009-03-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid overloading the CPU scheduler with enormous load averages when handling high call-rate NFS loads. When the knfsd bottom half is made aware of an incoming call by the socket layer, it tries to choose an nfsd thread and wake it up. As long as there are idle threads, one will be woken up. If there are lot of nfsd threads (a sensible configuration when the server is disk-bound or is running an HSM), there will be many more nfsd threads than CPUs to run them. Under a high call-rate low service-time workload, the result is that almost every nfsd is runnable, but only a handful are actually able to run. This situation causes two significant problems: 1. The CPU scheduler takes over 10% of each CPU, which is robbing the nfsd threads of valuable CPU time. 2. At a high enough load, the nfsd threads starve userspace threads of CPU time, to the point where daemons like portmap and rpc.mountd do not schedule for tens of seconds at a time. Clients attempting to mount an NFS filesystem timeout at the very first step (opening a TCP connection to portmap) because portmap cannot wake up from select() and call accept() in time. Disclaimer: these effects were observed on a SLES9 kernel, modern kernels' schedulers may behave more gracefully. The solution is simple: keep in each svc_pool a counter of the number of threads which have been woken but have not yet run, and do not wake any more if that count reaches an arbitrary small threshold. Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16 synthetic client threads simulating an rsync (i.e. recursive directory listing) workload reading from an i386 RH9 install image (161480 regular files in 10841 directories) on the server. That tree is small enough to fill in the server's RAM so no disk traffic was involved. This setup gives a sustained call rate in excess of 60000 calls/sec before being CPU-bound on the server. The server was running 128 nfsds. Profiling showed schedule() taking 6.7% of every CPU, and __wake_up() taking 5.2%. This patch drops those contributions to 3.0% and 2.2%. Load average was over 120 before the patch, and 20.9 after. This patch is a forward-ported version of knfsd-avoid-nfsd-overload which has been shipping in the SGI "Enhanced NFS" product since 2006. It has been posted before: http://article.gmane.org/gmane.linux.nfs/10374 Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Short write in nfsd becomes a full write to the clientDavid Shaw2009-03-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | If a filesystem being written to via NFS returns a short write count (as opposed to an error) to nfsd, nfsd treats that as a success for the entire write, rather than the short count that actually succeeded. For example, given a 8192 byte write, if the underlying filesystem only writes 4096 bytes, nfsd will ack back to the nfs client that all 8192 bytes were written. The nfs client does have retry logic for short writes, but this is never called as the client is told the complete write succeeded. There are probably other ways it could happen, but in my case it happened with a fuse (filesystem in userspace) filesystem which can rather easily have a partial write. Here is a patch to properly return the short write count to the client. Signed-off-by: David Shaw <dshaw@jabberwocky.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd4: remove use of mutex for file_hashtableJ. Bruce Fields2009-03-181-1/+1
| | | | | | | | | | | | | As part of reducing the scope of the client_mutex, and in order to remove the need for mutexes from the callback code (so that callbacks can be done as asynchronous rpc calls), move manipulations of the file_hashtable under the recall_lock. Update the relevant comments while we're here. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Alexandros Batsakis <batsakis@netapp.com> Reviewed-by: Benny Halevy <bhalevy@panasas.com>
* nfsd4: remove unused CHECK_FH flagJ. Bruce Fields2009-03-181-1/+0
| | | | | | All users now pass this, so it's meaningless. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd4: separate delegreturn case from preprocess_stateid_opJ. Bruce Fields2009-03-181-1/+0
| | | | | | | | | | | | Delegreturn is enough a special case for preprocess_stateid_op to warrant just open-coding it in delegreturn. There should be no change in behavior here; we're just reshuffling code. Thanks to Yang Hongyang for catching a critical typo. Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfs: replace uses of __constant_{endian}Harvey Harrison2009-03-184-98/+98
| | | | | | | | | | The base versions handle constant folding now, none of these headers are exported to userspace, so the __ prefixed versions are not necessary. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd4: use helper for copying delegation filehandleJ. Bruce Fields2009-03-181-4/+2
| | | | Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd4: use helper for copying filehandles for replayJ. Bruce Fields2009-03-182-2/+8
| | | | Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: nfsd should drop CAP_MKNOD for non-rootJ. Bruce Fields2009-03-171-2/+4
| | | | | | | | | | | | | | Since creating a device node is normally an operation requiring special privilege, Igor Zhbanov points out that it is surprising (to say the least) that a client can, for example, create a device node on a filesystem exported with root_squash. So, make sure CAP_MKNOD is among the capabilities dropped when an nfsd thread handles a request from a non-root user. Reported-by: Igor Zhbanov <izh1979@gmail.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Merge master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds2009-03-151-1/+9
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * master.kernel.org:/home/rmk/linux-2.6-arm: (23 commits) [ARM] Fix virtual to physical translation macro corner cases [ARM] update mach-types [ARM] 5421/1: ftrace: fix crash due to tracing of __naked functions MX1 fix include [ARM] 5419/1: ep93xx: fix build warnings about struct i2c_board_info [ARM] 5418/1: restore lr before leaving mcount ARM: OMAP: board-omap3beagle: set i2c-3 to 100kHz ARM: OMAP: Allow I2C bus driver to be compiled as a module ARM: OMAP: sched_clock() corrected ARM: OMAP: Fix compile error if pm.h is included [ARM] orion5x: pass dram mbus data to xor driver [ARM] S3C64XX: Fix s3c64xx_setrate_clksrc [ARM] S3C64XX: sparse warnings in arch/arm/plat-s3c64xx/irq.c [ARM] S3C64XX: sparse warnings in arch/arm/plat-s3c64xx/s3c6400-clock.c [ARM] S3C64XX: Fix USB host clock mux list [ARM] S3C64XX: Fix name of USB host clock. [ARM] S3C64XX: Rename IRQ_UHOST to IRQ_USBH [ARM] S3C64XX: Do gpiolib configuration earlier [ARM] S3C64XX: Staticise s3c64xx_init_irq_eint() [ARM] SMDK6410: Declare iodesc table static ...
| * [ARM] 5421/1: ftrace: fix crash due to tracing of __naked functionsUwe Kleine-König2009-03-121-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a fix for the following crash observed in 2.6.29-rc3: http://lkml.org/lkml/2009/1/29/150 On ARM it doesn't make sense to trace a naked function because then mcount is called without stack and frame pointer being set up and there is no chance to restore the lr register to the value before mcount was called. Reported-by: Matthias Kaehlcke <matthias@kaehlcke.net> Tested-by: Matthias Kaehlcke <matthias@kaehlcke.net> Cc: Abhishek Sagar <sagar.abhishek@gmail.com> Cc: Steven Rostedt <rostedt@home.goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* | Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2009-03-141-2/+2
|\ \ | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Fix Xilinx SystemACE driver to handle empty CF slot block: fix memory leak in bio_clone() block: Add gfp_mask parameter to bio_integrity_clone()
| * | block: Add gfp_mask parameter to bio_integrity_clone()un'ichi Nomura2009-03-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stricter gfp_mask might be required for clone allocation. For example, request-based dm may clone bio in interrupt context so it has to use GFP_ATOMIC. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds2009-03-144-69/+87
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (31 commits) [SCSI] qla2xxx: Update version number to 8.03.00-k4. [SCSI] qla2xxx: Correct overwrite of pre-assigned init-control-block structure size. [SCSI] qla2xxx: Correct truncation in return-code status checking. [SCSI] qla2xxx: Correct vport delete bug. [SCSI] qla2xxx: Use correct value for max vport in LOOP topology. [SCSI] qla2xxx: Correct address range checking for option-rom updates. [SCSI] fcoe: Change fcoe receive thread nice value from 19 (lowest priority) to -20 [SCSI] fcoe: fix handling of pending queue, prevent out of order frames (v3) [SCSI] fcoe: Out of order tx frames was causing several check condition SCSI status [SCSI] fcoe: fix kfree(skb) [SCSI] fcoe: ETH_P_8021Q is already in if_ether and fcoe is not using it anyway [SCSI] libfc: do not change the fh_rx_id of a recevied frame [SCSI] fcoe: Correct fcoe_transports initialization vs. registration [SCSI] fcoe: Use setup_timer() and mod_timer() [SCSI] libfc, fcoe: Remove unnecessary cast by removing inline wrapper [SCSI] libfc, fcoe: Cleanup function formatting and minor typos [SCSI] libfc, fcoe: Fix kerneldoc comments [SCSI] libfc: Cleanup libfc_function_template comments [SCSI] libfc: check for err when recv and state is incorrect [SCSI] libfc: rename rp to rdata in fc_disc_new_target() ...
| * | | [SCSI] fcoe: Out of order tx frames was causing several check condition SCSI ↵Vasu Dev2009-03-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | status frames followed by these errors in log. [sdp] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK [sdp] Sense Key : Aborted Command [current] [sdp] Add. Sense: Data phase error This was causing some test apps to exit due to write failure under heavy load. This was due to a race around adding and removing tx frame skb in fcoe_pending_queue, Chris Leech helped me to find that brief unlocking period when pulling skb from fcoe_pending_queue in various contexts (fcoe_watchdog and fcoe_xmit) and then adding skb back into fcoe_pending_queue up on a failed fcoe_start_io could change skb/tx frame order in fcoe_pending_queue. Thanks Chris. This patch allows only single context to pull skb from fcoe_pending_queue at any time to prevent above described ordering issue/race by use of fcoe_pending_queue_active flag. This patch simplified fcoe_watchdog with modified fcoe_check_wait_queue by use of FCOE_LOW_QUEUE_DEPTH instead previously used several conditionals to clear and set lp->qfull. I think FCOE_MAX_QUEUE_DEPTH with FCOE_LOW_QUEUE_DEPTH will work better in re/setting lp->qfull and these could be fine tuned for performance. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | [SCSI] fcoe: ETH_P_8021Q is already in if_ether and fcoe is not using it anywayYi Zou2009-03-101-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | [SCSI] libfc, fcoe: Remove unnecessary cast by removing inline wrapperRobert Love2009-03-101-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Comment from "Andrew Morton <akpm@linux-foundation.org>" > +{ > + return (struct fcoe_softc *)lport_priv(lp); unneeded/undesirable cast of void*. There are probably zillions of instances of this - there always are. This whole inline function was unnecessary. The FCoE layer knows that it's data structure is stored in the lport private data, it can just access it from lport_priv(). Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | [SCSI] libfc, fcoe: Fix kerneldoc commentsRobert Love2009-03-101-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1) Added '()' for function names in kerneldoc comments 2) Changed comment bookends from '**/' to '*/'. The comment on the the mailing list was that '**/' "is consistently unconventional. Not wrong, just odd." The Documentation/kernel-doc-nano-HOWTO.txt states that kerneldoc comment blocks should end with '**/' but most (if not all) instance I found under drivers/scsi/ were only using the '*/' so I converted to that style. 3) Removed incorrect linebreaks in kerneldoc comments where found 4) Removed a few unnecessary blank comment lines in kerneldoc comment blocks Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | [SCSI] libfc: Cleanup libfc_function_template commentsRobert Love2009-03-061-41/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Made the comments more like the comments for struct scsi_host_template. Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | [SCSI] libfc: Don't violate transport template for rogue port creationRobert Love2009-03-061-0/+5
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | [SCSI] libfc: rport retry on LS_RJT from certain ELSChris Leech2009-03-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows any rport ELS to retry on LS_RJT. The rport error handling would only retry on resource allocation failures and exchange timeouts. I have a target that will occasionally reject PLOGI when we do a quick LOGO/PLOGI. When a critical ELS was rejected, libfc would fail silently leaving the rport in a dead state. The retry count and delay are managed by fc_rport_error_retry. If the retry count is exceeded fc_rport_error will be called. When retrying is not the correct course of action, fc_rport_error can be called directly. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | [SCSI] libfc, fcoe: fixed locking issues with lport->lp_mutex around ↵Vasu Dev2009-03-061-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lport->link_status The fcoe_xmit could call fc_pause in case the pending skb queue len is larger than FCOE_MAX_QUEUE_DEPTH, the fc_pause was trying to grab lport->lp_muex to change lport->link_status and that had these issues :- 1. The fcoe_xmit was getting called with bh disabled, thus causing "BUG: scheduling while atomic" when grabbing lport->lp_muex with bh disabled. 2. fc_linkup and fc_linkdown function calls lport_enter function with lport->lp_mutex held and these enter function in turn calls fcoe_xmit to send lport related FC frame, e.g. fc_linkup => fc_lport_enter_flogi to send flogi req. In this case grabbing the same lport->lp_mutex again in fc_puase from fcoe_xmit would cause deadlock. The lport->lp_mutex was used for setting FC_PAUSE in fcoe_xmit path but FC_PAUSE bit was not used anywhere beside just setting and clear this bit in lport->link_status, instead used a separate field qfull in fc_lport to eliminate need for lport->lp_mutex to track pending queue full condition and in turn avoid above described two locking issues. Also added check for lp->qfull in fc_fcp_lport_queue_ready to trigger SCSI_MLQUEUE_HOST_BUSY when lp->qfull is set to prevent more scsi-ml cmds while lp->qfull is set. This patch eliminated FC_LINK_UP and FC_PAUSE and instead used dedicated fields in fc_lport for this, this simplified all related conditional code. Also removed fc_pause and fc_unpause functions and instead used newly added lport->qfull directly in fcoe. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | [SCSI] libfc: Pass lport in exch_mgr_resetAbhijeet Joglekar2009-03-061-2/+2
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fc_exch_mgr structure is private to fc_exch.c. To export exch_mgr_reset to transport, transport needs access to the exch manager. Change exch_mgr_reset to use lport param which is the shared structure between libFC and transport. Alternatively, fc_exch_mgr definition can be moved to libfc.h so that lport can be accessed from mp*. Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* | | Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds2009-03-143-0/+13
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix the fix to Bugzilla #11061, when IPv6 isn't defined... SUNRPC: xprt_connect() don't abort the task if the transport isn't bound SUNRPC: Fix an Oops due to socket not set up yet... Bug 11061, NFS mounts dropped NFS: Handle -ESTALE error in access() NLM: Fix GRANT callback address comparison when IPv6 is enabled NLM: Shrink the IPv4-only version of nlm_cmp_addr() NFSv3: Fix posix ACL code NFS: Fix misparsing of nfsv4 fs_locations attribute (take 2) SUNRPC: Tighten up the task locking rules in __rpc_execute()
| * | | NLM: Shrink the IPv4-only version of nlm_cmp_addr()Chuck Lever2009-03-101-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up/micro-optimatization: Make the AF_INET-only version of nlm_cmp_addr() smaller. This matches the style of nlm_privileged_requester(), and makes the AF_INET-only version of nlm_cmp_addr() nearly the same size as it was before IPv6 support. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * | | NFSv3: Fix posix ACL codeTrond Myklebust2009-03-102-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a memory leak due to allocation in the XDR layer. In cases where the RPC call needs to be retransmitted, we end up allocating new pages without clearing the old ones. Fix this by moving the allocation into nfs3_proc_setacls(). Also fix an issue discovered by Kevin Rudd, whereby the amount of memory reserved for the acls in the xdr_buf->head was miscalculated, and causing corruption. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | | | ide: save the returned value of dma_map_sgFUJITA Tomonori2009-03-131-0/+1
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dma_map_sg could return a value different to 'nents' argument of dma_map_sg so the ide stack needs to save it for the later usage (e.g. for_each_sg). The ide stack also needs to save the original sg_nents value for pci_unmap_sg. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> [bart: backport to Linus' tree] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds2009-03-122-0/+6
|\ \ \ | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: cpumask: mm_cpumask for accessing the struct mm_struct's cpu_vm_mask. cpumask: tsk_cpumask for accessing the struct task_struct's cpus_allowed.
| * | | cpumask: mm_cpumask for accessing the struct mm_struct's cpu_vm_mask.Rusty Russell2009-03-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows us to change the representation (to a dangling bitmap or cpumask_var_t) without breaking all the callers: they can use mm_cpumask() now and won't see a difference as the changes roll into linux-next. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| * | | cpumask: tsk_cpumask for accessing the struct task_struct's cpus_allowed.Rusty Russell2009-03-121-0/+3
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | This allows us to change the representation (to a dangling bitmap or cpumask_var_t) without breaking all the callers: they can use tsk_cpumask() now and won't see a difference as the changes roll into linux-next. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | | drm: fix EDID parser problem with positive/negative hsync/vsyncPantelis Koukousoulas2009-03-111-1/+1
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | Comparing the layouts of struct detail_pixel_timing with x.org's struct detailed_timings and how those are handled, it appears that the hsync_positive and vsync_positive fields are backwards. This patch fixes https://bugs.freedesktop.org/show_bug.cgi?id=20019 for me. It was tested on 2 monitors, LG FLATRON L225WS 22" and a YAKUMO 17" for which more details are unknown. Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
* | Merge branch 'fixes' of ↵Linus Torvalds2009-03-091-1/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule. Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."
| * | Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."Dave Jones2009-03-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit e088e4c9cdb618675874becb91b2fd581ee707e6. Removing the sysfs interface for p4-clockmod was flagged as a regression in bug 12826. Course of action: - Find out the remaining causes of overheating, and fix them if possible. ACPI should be doing the right thing automatically. If it isn't, we need to fix that. - mark p4-clockmod ui as deprecated - try again with the removal in six months. It's not really feasible to printk about the deprecation, because it needs to happen at all the sysfs entry points, which means adding a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch. Signed-off-by: Dave Jones <davej@redhat.com>
OpenPOWER on IntegriCloud