summaryrefslogtreecommitdiffstats
path: root/include/uapi/linux
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | | | | | | | btrfs: use common vfs LABEL ioctl definitionsEric Sandeen2019-09-091-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I lifted the btrfs label get/set ioctls to the vfs some time ago, but never followed up to use those common definitions directly in btrfs. This patch does that. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
* | | | | | | | | | | | Merge tag 'fsverity-for-linus' of ↵Linus Torvalds2019-09-182-0/+41
|\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt Pull fs-verity support from Eric Biggers: "fs-verity is a filesystem feature that provides Merkle tree based hashing (similar to dm-verity) for individual readonly files, mainly for the purpose of efficient authenticity verification. This pull request includes: (a) The fs/verity/ support layer and documentation. (b) fs-verity support for ext4 and f2fs. Compared to the original fs-verity patchset from last year, the UAPI to enable fs-verity on a file has been greatly simplified. Lots of other things were cleaned up too. fs-verity is planned to be used by two different projects on Android; most of the userspace code is in place already. Another userspace tool ("fsverity-utils"), and xfstests, are also available. e2fsprogs and f2fs-tools already have fs-verity support. Other people have shown interest in using fs-verity too. I've tested this on ext4 and f2fs with xfstests, both the existing tests and the new fs-verity tests. This has also been in linux-next since July 30 with no reported issues except a couple minor ones I found myself and folded in fixes for. Ted and I will be co-maintaining fs-verity" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: f2fs: add fs-verity support ext4: update on-disk format documentation for fs-verity ext4: add fs-verity read support ext4: add basic fs-verity support fs-verity: support builtin file signatures fs-verity: add SHA-512 support fs-verity: implement FS_IOC_MEASURE_VERITY ioctl fs-verity: implement FS_IOC_ENABLE_VERITY ioctl fs-verity: add data verification hooks for ->readpages() fs-verity: add the hook for file ->setattr() fs-verity: add the hook for file ->open() fs-verity: add inode and superblock fields fs-verity: add Kconfig and the helper functions for hashing fs: uapi: define verity bit for FS_IOC_GETFLAGS fs-verity: add UAPI header fs-verity: add MAINTAINERS file entry fs-verity: add a documentation file
| * | | | | | | | | | | | fs-verity: add SHA-512 supportEric Biggers2019-08-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add SHA-512 support to fs-verity. This is primarily a demonstration of the trivial changes needed to support a new hash algorithm in fs-verity; most users will still use SHA-256, due to the smaller space required to store the hashes. But some users may prefer SHA-512. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fs: uapi: define verity bit for FS_IOC_GETFLAGSEric Biggers2019-07-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add FS_VERITY_FL to the flags for FS_IOC_GETFLAGS, so that applications can easily determine whether a file is a verity file at the same time as they're checking other file flags. This flag will be gettable only; FS_IOC_SETFLAGS won't allow setting it, since an ioctl must be used instead to provide more parameters. This flag matches the on-disk bit that was already allocated for ext4. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fs-verity: add UAPI headerEric Biggers2019-07-281-0/+39
| | |_|_|_|_|_|_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the UAPI header for fs-verity, including two ioctls: - FS_IOC_ENABLE_VERITY - FS_IOC_MEASURE_VERITY These ioctls are documented in the "User API" section of Documentation/filesystems/fsverity.rst. Examples of using these ioctls can be found in fsverity-utils (https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/fsverity-utils.git). I've also written xfstests that test these ioctls (https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/xfstests-dev.git/log/?h=fsverity). Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>
* | | | | | | | | | | | Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds2019-09-182-51/+184
|\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull fscrypt updates from Eric Biggers: "This is a large update to fs/crypto/ which includes: - Add ioctls that add/remove encryption keys to/from a filesystem-level keyring. These fix user-reported issues where e.g. an encrypted home directory can break NetworkManager, sshd, Docker, etc. because they don't get access to the needed keyring. These ioctls also provide a way to lock encrypted directories that doesn't use the vm.drop_caches sysctl, so is faster, more reliable, and doesn't always need root. - Add a new encryption policy version ("v2") which switches to a more standard, secure, and flexible key derivation function, and starts verifying that the correct key was supplied before using it. The key derivation improvement is needed for its own sake as well as for ongoing feature work for which the current way is too inflexible. Work is in progress to update both Android and the 'fscrypt' userspace tool to use both these features. (Working patches are available and just need to be reviewed+merged.) Chrome OS will likely use them too. This has also been tested on ext4, f2fs, and ubifs with xfstests -- both the existing encryption tests, and the new tests for this. This has also been in linux-next since Aug 16 with no reported issues. I'm also using an fscrypt v2-encrypted home directory on my personal desktop" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: (27 commits) ext4 crypto: fix to check feature status before get policy fscrypt: document the new ioctls and policy version ubifs: wire up new fscrypt ioctls f2fs: wire up new fscrypt ioctls ext4: wire up new fscrypt ioctls fscrypt: require that key be added when setting a v2 encryption policy fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl fscrypt: allow unprivileged users to add/remove keys for v2 policies fscrypt: v2 encryption policy support fscrypt: add an HKDF-SHA512 implementation fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl fscrypt: rename keyinfo.c to keysetup.c fscrypt: move v1 policy key setup to keysetup_v1.c fscrypt: refactor key setup code in preparation for v2 policies fscrypt: rename fscrypt_master_key to fscrypt_direct_key fscrypt: add ->ci_inode to fscrypt_info fscrypt: use FSCRYPT_* definitions, not FS_* fscrypt: use FSCRYPT_ prefix for uapi constants ...
| * | | | | | | | | | | | fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctlEric Biggers2019-08-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a root-only variant of the FS_IOC_REMOVE_ENCRYPTION_KEY ioctl which removes all users' claims of the key, not just the current user's claim. I.e., it always removes the key itself, no matter how many users have added it. This is useful for forcing a directory to be locked, without having to figure out which user ID(s) the key was added under. This is planned to be used by a command like 'sudo fscrypt lock DIR --all-users' in the fscrypt userspace tool (http://github.com/google/fscrypt). Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fscrypt: allow unprivileged users to add/remove keys for v2 policiesEric Biggers2019-08-121-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow the FS_IOC_ADD_ENCRYPTION_KEY and FS_IOC_REMOVE_ENCRYPTION_KEY ioctls to be used by non-root users to add and remove encryption keys from the filesystem-level crypto keyrings, subject to limitations. Motivation: while privileged fscrypt key management is sufficient for some users (e.g. Android and Chromium OS, where a privileged process manages all keys), the old API by design also allows non-root users to set up and use encrypted directories, and we don't want to regress on that. Especially, we don't want to force users to continue using the old API, running into the visibility mismatch between files and keyrings and being unable to "lock" encrypted directories. Intuitively, the ioctls have to be privileged since they manipulate filesystem-level state. However, it's actually safe to make them unprivileged if we very carefully enforce some specific limitations. First, each key must be identified by a cryptographic hash so that a user can't add the wrong key for another user's files. For v2 encryption policies, we use the key_identifier for this. v1 policies don't have this, so managing keys for them remains privileged. Second, each key a user adds is charged to their quota for the keyrings service. Thus, a user can't exhaust memory by adding a huge number of keys. By default each non-root user is allowed up to 200 keys; this can be changed using the existing sysctl 'kernel.keys.maxkeys'. Third, if multiple users add the same key, we keep track of those users of the key (of which there remains a single copy), and won't really remove the key, i.e. "lock" the encrypted files, until all those users have removed it. This prevents denial of service attacks that would be possible under simpler schemes, such allowing the first user who added a key to remove it -- since that could be a malicious user who has compromised the key. Of course, encryption keys should be kept secret, but the idea is that using encryption should never be *less* secure than not using encryption, even if your key was compromised. We tolerate that a user will be unable to really remove a key, i.e. unable to "lock" their encrypted files, if another user has added the same key. But in a sense, this is actually a good thing because it will avoid providing a false notion of security where a key appears to have been removed when actually it's still in memory, available to any attacker who compromises the operating system kernel. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fscrypt: v2 encryption policy supportEric Biggers2019-08-121-8/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new fscrypt policy version, "v2". It has the following changes from the original policy version, which we call "v1" (*): - Master keys (the user-provided encryption keys) are only ever used as input to HKDF-SHA512. This is more flexible and less error-prone, and it avoids the quirks and limitations of the AES-128-ECB based KDF. Three classes of cryptographically isolated subkeys are defined: - Per-file keys, like used in v1 policies except for the new KDF. - Per-mode keys. These implement the semantics of the DIRECT_KEY flag, which for v1 policies made the master key be used directly. These are also planned to be used for inline encryption when support for it is added. - Key identifiers (see below). - Each master key is identified by a 16-byte master_key_identifier, which is derived from the key itself using HKDF-SHA512. This prevents users from associating the wrong key with an encrypted file or directory. This was easily possible with v1 policies, which identified the key by an arbitrary 8-byte master_key_descriptor. - The key must be provided in the filesystem-level keyring, not in a process-subscribed keyring. The following UAPI additions are made: - The existing ioctl FS_IOC_SET_ENCRYPTION_POLICY can now be passed a fscrypt_policy_v2 to set a v2 encryption policy. It's disambiguated from fscrypt_policy/fscrypt_policy_v1 by the version code prefix. - A new ioctl FS_IOC_GET_ENCRYPTION_POLICY_EX is added. It allows getting the v1 or v2 encryption policy of an encrypted file or directory. The existing FS_IOC_GET_ENCRYPTION_POLICY ioctl could not be used because it did not have a way for userspace to indicate which policy structure is expected. The new ioctl includes a size field, so it is extensible to future fscrypt policy versions. - The ioctls FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY, and FS_IOC_GET_ENCRYPTION_KEY_STATUS now support managing keys for v2 encryption policies. Such keys are kept logically separate from keys for v1 encryption policies, and are identified by 'identifier' rather than by 'descriptor'. The 'identifier' need not be provided when adding a key, since the kernel will calculate it anyway. This patch temporarily keeps adding/removing v2 policy keys behind the same permission check done for adding/removing v1 policy keys: capable(CAP_SYS_ADMIN). However, the next patch will carefully take advantage of the cryptographically secure master_key_identifier to allow non-root users to add/remove v2 policy keys, thus providing a full replacement for v1 policies. (*) Actually, in the API fscrypt_policy::version is 0 while on-disk fscrypt_context::format is 1. But I believe it makes the most sense to advance both to '2' to have them be in sync, and to consider the numbering to start at 1 except for the API quirk. Reviewed-by: Paul Crowley <paulcrowley@google.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctlEric Biggers2019-08-121-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new fscrypt ioctl, FS_IOC_GET_ENCRYPTION_KEY_STATUS. Given a key specified by 'struct fscrypt_key_specifier' (the same way a key is specified for the other fscrypt key management ioctls), it returns status information in a 'struct fscrypt_get_key_status_arg'. The main motivation for this is that applications need to be able to check whether an encrypted directory is "unlocked" or not, so that they can add the key if it is not, and avoid adding the key (which may involve prompting the user for a passphrase) if it already is. It's possible to use some workarounds such as checking whether opening a regular file fails with ENOKEY, or checking whether the filenames "look like gibberish" or not. However, no workaround is usable in all cases. Like the other key management ioctls, the keyrings syscalls may seem at first to be a good fit for this. Unfortunately, they are not. Even if we exposed the keyring ID of the ->s_master_keys keyring and gave everyone Search permission on it (note: currently the keyrings permission system would also allow everyone to "invalidate" the keyring too), the fscrypt keys have an additional state that doesn't map cleanly to the keyrings API: the secret can be removed, but we can be still tracking the files that were using the key, and the removal can be re-attempted or the secret added again. After later patches, some applications will also need a way to determine whether a key was added by the current user vs. by some other user. Reserved fields are included in fscrypt_get_key_status_arg for this and other future extensions. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctlEric Biggers2019-08-121-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY. This ioctl removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY. It wipes the secret key itself, then "locks" the encrypted files and directories that had been unlocked using that key -- implemented by evicting the relevant dentries and inodes from the VFS caches. The problem this solves is that many fscrypt users want the ability to remove encryption keys, causing the corresponding encrypted directories to appear "locked" (presented in ciphertext form) again. Moreover, users want removing an encryption key to *really* remove it, in the sense that the removed keys cannot be recovered even if kernel memory is compromised, e.g. by the exploit of a kernel security vulnerability or by a physical attack. This is desirable after a user logs out of the system, for example. In many cases users even already assume this to be the case and are surprised to hear when it's not. It is not sufficient to simply unlink the master key from the keyring (or to revoke or invalidate it), since the actual encryption transform objects are still pinned in memory by their inodes. Therefore, to really remove a key we must also evict the relevant inodes. Currently one workaround is to run 'sync && echo 2 > /proc/sys/vm/drop_caches'. But, that evicts all unused inodes in the system rather than just the inodes associated with the key being removed, causing severe performance problems. Moreover, it requires root privileges, so regular users can't "lock" their encrypted files. Another workaround, used in Chromium OS kernels, is to add a new VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of drop_caches that operates on a single super_block. It does: shrink_dcache_sb(sb); invalidate_inodes(sb, false); But it's still a hack. Yet, the major users of filesystem encryption want this feature badly enough that they are actually using these hacks. To properly solve the problem, start maintaining a list of the inodes which have been "unlocked" using each master key. Originally this wasn't possible because the kernel didn't keep track of in-use master keys at all. But, with the ->s_master_keys keyring it is now possible. Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY. It finds the specified master key in ->s_master_keys, then wipes the secret key itself, which prevents any additional inodes from being unlocked with the key. Then, it syncs the filesystem and evicts the inodes in the key's list. The normal inode eviction code will free and wipe the per-file keys (in ->i_crypt_info). Note that freeing ->i_crypt_info without evicting the inodes was also considered, but would have been racy. Some inodes may still be in use when a master key is removed, and we can't simply revoke random file descriptors, mmap's, etc. Thus, the ioctl simply skips in-use inodes, and returns -EBUSY to indicate that some inodes weren't evicted. The master key *secret* is still removed, but the fscrypt_master_key struct remains to keep track of the remaining inodes. Userspace can then retry the ioctl to evict the remaining inodes. Alternatively, if userspace adds the key again, the refreshed secret will be associated with the existing list of inodes so they remain correctly tracked for future key removals. The ioctl doesn't wipe pagecache pages. Thus, we tolerate that after a kernel compromise some portions of plaintext file contents may still be recoverable from memory. This can be solved by enabling page poisoning system-wide, which security conscious users may choose to do. But it's very difficult to solve otherwise, e.g. note that plaintext file contents may have been read in other places than pagecache pages. Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is initially restricted to privileged users only. This is sufficient for some use cases, but not all. A later patch will relax this restriction, but it will require introducing key hashes, among other changes. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctlEric Biggers2019-08-121-10/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new fscrypt ioctl, FS_IOC_ADD_ENCRYPTION_KEY. This ioctl adds an encryption key to the filesystem's fscrypt keyring ->s_master_keys, making any files encrypted with that key appear "unlocked". Why we need this ~~~~~~~~~~~~~~~~ The main problem is that the "locked/unlocked" (ciphertext/plaintext) status of encrypted files is global, but the fscrypt keys are not. fscrypt only looks for keys in the keyring(s) the process accessing the filesystem is subscribed to: the thread keyring, process keyring, and session keyring, where the session keyring may contain the user keyring. Therefore, userspace has to put fscrypt keys in the keyrings for individual users or sessions. But this means that when a process with a different keyring tries to access encrypted files, whether they appear "unlocked" or not is nondeterministic. This is because it depends on whether the files are currently present in the inode cache. Fixing this by consistently providing each process its own view of the filesystem depending on whether it has the key or not isn't feasible due to how the VFS caches work. Furthermore, while sometimes users expect this behavior, it is misguided for two reasons. First, it would be an OS-level access control mechanism largely redundant with existing access control mechanisms such as UNIX file permissions, ACLs, LSMs, etc. Encryption is actually for protecting the data at rest. Second, almost all users of fscrypt actually do need the keys to be global. The largest users of fscrypt, Android and Chromium OS, achieve this by having PID 1 create a "session keyring" that is inherited by every process. This works, but it isn't scalable because it prevents session keyrings from being used for any other purpose. On general-purpose Linux distros, the 'fscrypt' userspace tool [1] can't similarly abuse the session keyring, so to make 'sudo' work on all systems it has to link all the user keyrings into root's user keyring [2]. This is ugly and raises security concerns. Moreover it can't make the keys available to system services, such as sshd trying to access the user's '~/.ssh' directory (see [3], [4]) or NetworkManager trying to read certificates from the user's home directory (see [5]); or to Docker containers (see [6], [7]). By having an API to add a key to the *filesystem* we'll be able to fix the above bugs, remove userspace workarounds, and clearly express the intended semantics: the locked/unlocked status of an encrypted directory is global, and encryption is orthogonal to OS-level access control. Why not use the add_key() syscall ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We use an ioctl for this API rather than the existing add_key() system call because the ioctl gives us the flexibility needed to implement fscrypt-specific semantics that will be introduced in later patches: - Supporting key removal with the semantics such that the secret is removed immediately and any unused inodes using the key are evicted; also, the eviction of any in-use inodes can be retried. - Calculating a key-dependent cryptographic identifier and returning it to userspace. - Allowing keys to be added and removed by non-root users, but only keys for v2 encryption policies; and to prevent denial-of-service attacks, users can only remove keys they themselves have added, and a key is only really removed after all users who added it have removed it. Trying to shoehorn these semantics into the keyrings syscalls would be very difficult, whereas the ioctls make things much easier. However, to reuse code the implementation still uses the keyrings service internally. Thus we get lockless RCU-mode key lookups without having to re-implement it, and the keys automatically show up in /proc/keys for debugging purposes. References: [1] https://github.com/google/fscrypt [2] https://goo.gl/55cCrI#heading=h.vf09isp98isb [3] https://github.com/google/fscrypt/issues/111#issuecomment-444347939 [4] https://github.com/google/fscrypt/issues/116 [5] https://bugs.launchpad.net/ubuntu/+source/fscrypt/+bug/1770715 [6] https://github.com/google/fscrypt/issues/128 [7] https://askubuntu.com/questions/1130306/cannot-run-docker-on-an-encrypted-filesystem Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fscrypt: use FSCRYPT_* definitions, not FS_*Eric Biggers2019-08-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update fs/crypto/ to use the new names for the UAPI constants rather than the old names, then make the old definitions conditional on !__KERNEL__. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fscrypt: use FSCRYPT_ prefix for uapi constantsEric Biggers2019-08-121-23/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prefix all filesystem encryption UAPI constants except the ioctl numbers with "FSCRYPT_" rather than with "FS_". This namespaces the constants more appropriately and makes it clear that they are related specifically to the filesystem encryption feature, and to the 'fscrypt_*' structures. With some of the old names like "FS_POLICY_FLAGS_VALID", it was not immediately clear that the constant had anything to do with encryption. This is also useful because we'll be adding more encryption-related constants, e.g. for the policy version, and we'd otherwise have to choose whether to use unclear names like FS_POLICY_V1 or inconsistent names like FS_ENCRYPTION_POLICY_V1. For source compatibility with existing userspace programs, keep the old names defined as aliases to the new names. Finally, as long as new names are being defined anyway, I skipped defining new names for the fscrypt mode numbers that aren't actually used: INVALID (0), AES_256_GCM (2), AES_256_CBC (3), SPECK128_256_XTS (7), and SPECK128_256_CTS (8). Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
| * | | | | | | | | | | | fs, fscrypt: move uapi definitions to new header <linux/fscrypt.h>Eric Biggers2019-08-122-51/+64
| | |_|_|_|/ / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | More fscrypt definitions are being added, and we shouldn't use a disproportionate amount of space in <linux/fs.h> for fscrypt stuff. So move the fscrypt definitions to a new header <linux/fscrypt.h>. For source compatibility with existing userspace programs, <linux/fs.h> still includes the new header. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
* | | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds2019-09-1825-19/+525
|\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Support IPV6 RA Captive Portal Identifier, from Maciej Żenczykowski. 2) Use bio_vec in the networking instead of custom skb_frag_t, from Matthew Wilcox. 3) Make use of xmit_more in r8169 driver, from Heiner Kallweit. 4) Add devmap_hash to xdp, from Toke Høiland-Jørgensen. 5) Support all variants of 5750X bnxt_en chips, from Michael Chan. 6) More RTNL avoidance work in the core and mlx5 driver, from Vlad Buslov. 7) Add TCP syn cookies bpf helper, from Petar Penkov. 8) Add 'nettest' to selftests and use it, from David Ahern. 9) Add extack support to drop_monitor, add packet alert mode and support for HW drops, from Ido Schimmel. 10) Add VLAN offload to stmmac, from Jose Abreu. 11) Lots of devm_platform_ioremap_resource() conversions, from YueHaibing. 12) Add IONIC driver, from Shannon Nelson. 13) Several kTLS cleanups, from Jakub Kicinski. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1930 commits) mlxsw: spectrum_buffers: Add the ability to query the CPU port's shared buffer mlxsw: spectrum: Register CPU port with devlink mlxsw: spectrum_buffers: Prevent changing CPU port's configuration net: ena: fix incorrect update of intr_delay_resolution net: ena: fix retrieval of nonadaptive interrupt moderation intervals net: ena: fix update of interrupt moderation register net: ena: remove all old adaptive rx interrupt moderation code from ena_com net: ena: remove ena_restore_ethtool_params() and relevant fields net: ena: remove old adaptive interrupt moderation code from ena_netdev net: ena: remove code duplication in ena_com_update_nonadaptive_moderation_interval _*() net: ena: enable the interrupt_moderation in driver_supported_features net: ena: reimplement set/get_coalesce() net: ena: switch to dim algorithm for rx adaptive interrupt moderation net: ena: add intr_moder_rx_interval to struct ena_com_dev and use it net: phy: adin: implement Energy Detect Powerdown mode via phy-tunable ethtool: implement Energy Detect Powerdown support via phy-tunable xen-netfront: do not assume sk_buff_head list is empty in error handling s390/ctcm: Delete unnecessary checks before the macro call “dev_kfree_skb” net: ena: don't wake up tx queue when down drop_monitor: Better sanitize notified packets ...
| * | | | | | | | | | | | ethtool: implement Energy Detect Powerdown support via phy-tunableAlexandru Ardelean2019-09-161-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The `phy_tunable_id` has been named `ETHTOOL_PHY_EDPD` since it looks like this feature is common across other PHYs (like EEE), and defining `ETHTOOL_PHY_ENERGY_DETECT_POWER_DOWN` seems too long. The way EDPD works, is that the RX block is put to a lower power mode, except for link-pulse detection circuits. The TX block is also put to low power mode, but the PHY wakes-up periodically to send link pulses, to avoid lock-ups in case the other side is also in EDPD mode. Currently, there are 2 PHY drivers that look like they could use this new PHY tunable feature: the `adin` && `micrel` PHYs. The ADIN's datasheet mentions that TX pulses are at intervals of 1 second default each, and they can be disabled. For the Micrel KSZ9031 PHY, the datasheet does not mention whether they can be disabled, but mentions that they can modified. The way this change is structured, is similar to the PHY tunable downshift control: * a `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` value is exposed to cover a default TX interval; some PHYs could specify a certain value that makes sense * `ETHTOOL_PHY_EDPD_NO_TX` would disable TX when EDPD is enabled * `ETHTOOL_PHY_EDPD_DISABLE` will disable EDPD As noted by the `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` the interval unit is 1 millisecond, which should cover a reasonable range of intervals: - from 1 millisecond, which does not sound like much of a power-saver - to ~65 seconds which is quite a lot to wait for a link to come up when plugging a cable Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | taprio: Add support for hardware offloadingVinicius Costa Gomes2019-09-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows taprio to offload the schedule enforcement to capable network cards, resulting in more precise windows and less CPU usage. The gate mask acts on traffic classes (groups of queues of same priority), as specified in IEEE 802.1Q-2018, and following the existing taprio and mqprio semantics. It is up to the driver to perform conversion between tc and individual netdev queues if for some reason it needs to make that distinction. Full offload is requested from the network interface by specifying "flags 2" in the tc qdisc creation command, which in turn corresponds to the TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD bit. The important detail here is the clockid which is implicitly /dev/ptpN for full offload, and hence not configurable. A reference counting API is added to support the use case where Ethernet drivers need to keep the taprio offload structure locally (i.e. they are a multi-port switch driver, and configuring a port depends on the settings of other ports as well). The refcount_t variable is kept in a private structure (__tc_taprio_qopt_offload) and not exposed to drivers. In the future, the private structure might also be expanded with a backpointer to taprio_sched *q, to implement the notification system described in the patch (of when admin became oper, or an error occurred, etc, so the offload can be monitored with 'tc qdisc show'). Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | tcp: Add snd_wnd to TCP_INFOThomas Higdon2019-09-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Neal Cardwell mentioned that snd_wnd would be useful for diagnosing TCP performance problems -- > (1) Usually when we're diagnosing TCP performance problems, we do so > from the sender, since the sender makes most of the > performance-critical decisions (cwnd, pacing, TSO size, TSQ, etc). > From the sender-side the thing that would be most useful is to see > tp->snd_wnd, the receive window that the receiver has advertised to > the sender. This serves the purpose of adding an additional __u32 to avoid the would-be hole caused by the addition of the tcpi_rcvi_ooopack field. Signed-off-by: Thomas Higdon <tph@fb.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | tcp: Add TCP_INFO counter for packets received out-of-orderThomas Higdon2019-09-161-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For receive-heavy cases on the server-side, we want to track the connection quality for individual client IPs. This counter, similar to the existing system-wide TCPOFOQueue counter in /proc/net/netstat, tracks out-of-order packet reception. By providing this counter in TCP_INFO, it will allow understanding to what degree receive-heavy sockets are experiencing out-of-order delivery and packet drops indicating congestion. Please note that this is similar to the counter in NetBSD TCP_INFO, and has the same name. Also note that we avoid increasing the size of the tcp_sock struct by taking advantage of a hole. Signed-off-by: Thomas Higdon <tph@fb.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2019-09-151-0/+1
| |\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Minor overlapping changes in the btusb and ixgbe drivers. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | net: devlink: move reload fail indication to devlink core and expose to userJiri Pirko2019-09-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the fact that devlink reload failed is stored in drivers. Move this flag into devlink core. Also, expose it to the user. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | PTP: add support for one-shot outputFelipe Balbi2019-09-131-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some controllers allow for a one-shot output pulse, in contrast to periodic output. Now that we have extensible versions of our IOCTLs, we can finally make use of the 'flags' field to pass a bit telling driver that if we want one-shot pulse output. Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Reviewed-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | PTP: introduce new versions of IOCTLsFelipe Balbi2019-09-131-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current version of the IOCTL have a small problem which prevents us from extending the API by making use of reserved fields. In these new IOCTLs, we are now making sure that flags and rsv fields are zero which will allow us to extend the API in the future. Reviewed-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2019-09-131-1/+2
| |\ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Fix error path of nf_tables_updobj(), from Dan Carpenter. 2) Move large structure away from stack in the nf_tables offload infrastructure, from Arnd Bergmann. 3) Move indirect flow_block logic to nf_tables_offload. 4) Support for synproxy objects, from Fernando Fernandez Mancera. 5) Support for fwd and dup offload. 6) Add __nft_offload_get_chain() helper, this implicitly fixes missing mutex and check for offload flags in the indirect block support, patch from wenxu. 7) Remove rules on device unregistration, from wenxu. This includes two preparation patches to reuse nft_flow_offload_chain() and nft_flow_offload_rule(). Large batch from Jeremy Sowden to make a second pass to the CONFIG_HEADER_TEST support and a bit of housekeeping: 8) Missing include guard in conntrack label header, from Jeremy Sowden. 9) A few coding style errors: trailing whitespace, incorrect indent in Kconfig, and semicolons at the end of function definitions. 10) Remove unused ipt_init() and ip6t_init() declarations. 11) Inline xt_hashlimit, ebt_802_3 and xt_physdev headers. They are only used once. 12) Update include directive in several netfilter files. 13) Remove unused include/net/netfilter/ipv6/nf_conntrack_icmpv6.h. 14) Move nf_ip6_ext_hdr() to include/linux/netfilter_ipv6.h 15) Move several synproxy structure definitions to nf_synproxy.h 16) Move nf_bridge_frag_data structure to include/linux/netfilter_bridge.h 17) Clean up static inline definitions in nf_conntrack_ecache.h. 18) Replace defined(CONFIG...) || defined(CONFIG...MODULE) with IS_ENABLED(CONFIG...). 19) Missing inline function conditional definitions based on Kconfig preferences in synproxy and nf_conntrack_timeout. 20) Update br_nf_pre_routing_ipv6() definition. 21) Move conntrack code in linux/skbuff.h to nf_conntrack headers. 22) Several patches to remove superfluous CONFIG_NETFILTER and CONFIG_NF_CONNTRACK checks in headers, coming from the initial batch support for CONFIG_HEADER_TEST for netfilter. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | | | | | | netfilter: nft_synproxy: add synproxy stateful object supportFernando Fernandez Mancera2019-09-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Register a new synproxy stateful object type into the stateful object infrastructure. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | | | | | | | | | | | devlink: add unknown 'fw_load_policy' valueDirk van der Merwe2019-09-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to the 'reset_dev_on_drv_probe' devlink parameter, it is useful to have an unknown value which can be used by drivers to report that the hardware value isn't recognized or is otherwise invalid instead of failing the operation. This is especially useful for u8/enum parameters. Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | | Merge tag 'mac80211-next-for-davem-2019-09-11' of ↵David S. Miller2019-09-111-0/+3
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== We have a number of changes, but things are settling down: * a fix in the new 6 GHz channel support * a fix for recent minstrel (rate control) updates for an infinite loop * handle interface type changes better wrt. management frame registrations (for management frames sent to userspace) * add in-BSS RX time to survey information * handle HW rfkill properly if !CONFIG_RFKILL * send deauth on IBSS station expiry, to avoid state mismatches * handle deferred crypto tailroom updates in mac80211 better when device restart happens * fix a spectre-v1 - really a continuation of a previous patch * advertise NL80211_CMD_UPDATE_FT_IES as supported if so * add some missing parsing in VHT extended NSS support * support HE in mac80211_hwsim * let mac80211 drivers determine the max MTU themselves along with the usual cleanups etc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | | | | | | | cfg80211: add local BSS receive time to survey informationFelix Fietkau2019-08-301-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is useful for checking how much airtime is being used up by other transmissions on the channel, e.g. by calculating (time_rx - time_bss_rx) or (time_busy - time_bss_rx - time_tx) Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20190828102042.58016-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | | | | | | | | | | | devlink: add 'reset_dev_on_drv_probe' paramDirk van der Merwe2019-09-101-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the 'reset_dev_on_drv_probe' devlink parameter, controlling the device reset policy on driver probe. This parameter is useful in conjunction with the existing 'fw_load_policy' parameter. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | | | devlink: extend 'fw_load_policy' valuesDirk van der Merwe2019-09-101-0/+1
| | |/ / / / / / / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the 'disk' value to the generic 'fw_load_policy' devlink parameter. This value indicates that firmware should always be loaded from disk only. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2019-09-072-0/+18
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Add nft_reg_store64() and nft_reg_load64() helpers, from Ander Juaristi. 2) Time matching support, also from Ander Juaristi. 3) VLAN support for nfnetlink_log, from Michael Braun. 4) Support for set element deletions from the packet path, also from Ander. 5) Remove __read_mostly from conntrack spinlock, from Li RongQing. 6) Support for updating stateful objects, this also includes the initial client for this infrastructure: the quota extension. A follow up fix for the control plane also comes in this batch. Patches from Fernando Fernandez Mancera. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | | | | | | | netfilter: nft_dynset: support for element deletionAnder Juaristi2019-08-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the delete operation from the ruleset. It implements a new delete() function in nft_set_rhash. It is simpler to use than the already existing remove(), because it only takes the set and the key as arguments, whereas remove() expects a full nft_set_elem structure. Signed-off-by: Ander Juaristi <a@juaristi.eus> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | | | | | | | | | | netfilter: nfnetlink_log: add support for VLAN informationMichael Braun2019-08-261-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, there is no vlan information (e.g. when used with a vlan aware bridge) passed to userspache, HWHEADER will contain an 08 00 (ip) suffix even for tagged ip packets. Therefore, add an extra netlink attribute that passes the vlan information to userspace similarly to 15824ab29f for nfqueue. Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | | | | | | | | | | | netfilter: nft_meta: support for time matchingAnder Juaristi2019-08-261-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces meta matches in the kernel for time (a UNIX timestamp), day (a day of week, represented as an integer between 0-6), and hour (an hour in the current day, or: number of seconds since midnight). All values are taken as unsigned 64-bit integers. The 'time' keyword is internally converted to nanoseconds by nft in userspace, and hence the timestamp is taken in nanoseconds as well. Signed-off-by: Ander Juaristi <a@juaristi.eus> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | | | | | | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2019-09-062-3/+34
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add the ability to use unaligned chunks in the AF_XDP umem. By relaxing where the chunks can be placed, it allows to use an arbitrary buffer size and place whenever there is a free address in the umem. Helps more seamless DPDK AF_XDP driver integration. Support for i40e, ixgbe and mlx5e, from Kevin and Maxim. 2) Addition of a wakeup flag for AF_XDP tx and fill rings so the application can wake up the kernel for rx/tx processing which avoids busy-spinning of the latter, useful when app and driver is located on the same core. Support for i40e, ixgbe and mlx5e, from Magnus and Maxim. 3) bpftool fixes for printf()-like functions so compiler can actually enforce checks, bpftool build system improvements for custom output directories, and addition of 'bpftool map freeze' command, from Quentin. 4) Support attaching/detaching XDP programs from 'bpftool net' command, from Daniel. 5) Automatic xskmap cleanup when AF_XDP socket is released, and several barrier/{read,write}_once fixes in AF_XDP code, from Björn. 6) Relicense of bpf_helpers.h/bpf_endian.h for future libbpf inclusion as well as libbpf versioning improvements, from Andrii. 7) Several new BPF kselftests for verifier precision tracking, from Alexei. 8) Several BPF kselftest fixes wrt endianess to run on s390x, from Ilya. 9) And more BPF kselftest improvements all over the place, from Stanislav. 10) Add simple BPF map op cache for nfp driver to batch dumps, from Jakub. 11) AF_XDP socket umem mapping improvements for 32bit archs, from Ivan. 12) Add BPF-to-BPF call and BTF line info support for s390x JIT, from Yauheni. 13) Small optimization in arm64 JIT to spare 1 insns for BPF_MOD, from Jerin. 14) Fix an error check in bpf_tcp_gen_syncookie() helper, from Petar. 15) Various minor fixes and cleanups, from Nathan, Masahiro, Masanari, Peter, Wei, Yue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | | | | | | | | | | | xsk: add support to allow unaligned chunk placementKevin Laatz2019-08-311-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, addresses are chunk size aligned. This means, we are very restricted in terms of where we can place chunk within the umem. For example, if we have a chunk size of 2k, then our chunks can only be placed at 0,2k,4k,6k,8k... and so on (ie. every 2k starting from 0). This patch introduces the ability to use unaligned chunks. With these changes, we are no longer bound to having to place chunks at a 2k (or whatever your chunk size is) interval. Since we are no longer dealing with aligned chunks, they can now cross page boundaries. Checks for page contiguity have been added in order to keep track of which pages are followed by a physically contiguous page. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * | | | | | | | | | | | | | | bpf: introduce verifier internal test flagAlexei Starovoitov2019-08-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce BPF_F_TEST_STATE_FREQ flag to stress test parentage chain and state pruning. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * | | | | | | | | | | | | | | bpf: clarify when bpf_trace_printk discards linesPeter Wu2019-08-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I opened /sys/kernel/tracing/trace once and kept reading from it. bpf_trace_printk somehow did not seem to work, no entries were appended to that trace file. It turns out that tracing is disabled when that file is open. Save the next person some time and document this. The trace file is described in Documentation/trace/ftrace.rst, however the implication "tracing is disabled" did not immediate translate to "bpf_trace_printk silently discards entries". Signed-off-by: Peter Wu <peter@lekensteyn.nl> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * | | | | | | | | | | | | | | bpf: fix 'struct pt_reg' typo in documentationPeter Wu2019-08-211-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no 'struct pt_reg'. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * | | | | | | | | | | | | | | bpf: add new BPF_BTF_GET_NEXT_ID syscall commandQuentin Monnet2019-08-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new command for the bpf() system call: BPF_BTF_GET_NEXT_ID is used to cycle through all BTF objects loaded on the system. The motivation is to be able to inspect (list) all BTF objects presents on the system. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * | | | | | | | | | | | | | | bpf: support cloning sk storage on accept()Stanislav Fomichev2019-08-171-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add new helper bpf_sk_storage_clone which optionally clones sk storage and call it from sk_clone_lock. Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * | | | | | | | | | | | | | | xsk: add support for need_wakeup flag in AF_XDP ringsMagnus Karlsson2019-08-171-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds support for a new flag called need_wakeup in the AF_XDP Tx and fill rings. When this flag is set, it means that the application has to explicitly wake up the kernel Rx (for the bit in the fill ring) or kernel Tx (for bit in the Tx ring) processing by issuing a syscall. Poll() can wake up both depending on the flags submitted and sendto() will wake up tx processing only. The main reason for introducing this new flag is to be able to efficiently support the case when application and driver is executing on the same core. Previously, the driver was just busy-spinning on the fill ring if it ran out of buffers in the HW and there were none on the fill ring. This approach works when the application is running on another core as it can replenish the fill ring while the driver is busy-spinning. Though, this is a lousy approach if both of them are running on the same core as the probability of the fill ring getting more entries when the driver is busy-spinning is zero. With this new feature the driver now sets the need_wakeup flag and returns to the application. The application can then replenish the fill queue and then explicitly wake up the Rx processing in the kernel using the syscall poll(). For Tx, the flag is only set to one if the driver has no outstanding Tx completion interrupts. If it has some, the flag is zero as it will be woken up by a completion interrupt anyway. As a nice side effect, this new flag also improves the performance of the case where application and driver are running on two different cores as it reduces the number of syscalls to the kernel. The kernel tells user space if it needs to be woken up by a syscall, and this eliminates many of the syscalls. This flag needs some simple driver support. If the driver does not support this, the Rx flag is always zero and the Tx flag is always one. This makes any application relying on this feature default to the old behaviour of not requiring any syscalls in the Rx path and always having to call sendto() in the Tx path. For backwards compatibility reasons, this feature has to be explicitly turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend that you always turn it on as it so far always have had a positive performance impact. The name and inspiration of the flag has been taken from io_uring by Jens Axboe. Details about this feature in io_uring can be found in http://kernel.dk/io_uring.pdf, section 8.3. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | | | | | | | | | | | | | | net_sched: act_police: add 2 new attributes to support police 64bit rate and ↵David Dai2019-09-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | peakrate For high speed adapter like Mellanox CX-5 card, it can reach upto 100 Gbits per second bandwidth. Currently htb already supports 64bit rate in tc utility. However police action rate and peakrate are still limited to 32bit value (upto 32 Gbits per second). Add 2 new attributes TCA_POLICE_RATE64 and TCA_POLICE_RATE64 in kernel for 64bit support so that tc utility can use them for 64bit rate and peakrate value to break the 32bit limit, and still keep the backward binary compatibility. Tested-by: David Dai <zdai@linux.vnet.ibm.com> Signed-off-by: David Dai <zdai@linux.vnet.ibm.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | | | | net: openvswitch: Set OvS recirc_id from tc chain indexPaul Blakey2019-09-061-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Offloaded OvS datapath rules are translated one to one to tc rules, for example the following simplified OvS rule: recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2) Will be translated to the following tc rule: $ tc filter add dev dev1 ingress \ prio 1 chain 0 proto ip \ flower tcp ct_state -trk \ action ct pipe \ action goto chain 2 Received packets will first travel though tc, and if they aren't stolen by it, like in the above rule, they will continue to OvS datapath. Since we already did some actions (action ct in this case) which might modify the packets, and updated action stats, we would like to continue the proccessing with the correct recirc_id in OvS (here recirc_id(2)) where we left off. To support this, introduce a new skb extension for tc, which will be used for translating tc chain to ovs recirc_id to handle these miss cases. Last tc chain index will be set by tc goto chain action and read by OvS datapath. Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | | | | can: add support of SAE J1939 protocolThe j1939 authors2019-09-041-0/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SAE J1939 is the vehicle bus recommended practice used for communication and diagnostics among vehicle components. Originating in the car and heavy-duty truck industry in the United States, it is now widely used in other parts of the world. J1939, ISO 11783 and NMEA 2000 all share the same high level protocol. SAE J1939 can be considered the replacement for the older SAE J1708 and SAE J1587 specifications. Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Bastian Stender <bst@pengutronix.de> Signed-off-by: Elenita Hinds <ecathinds@gmail.com> Signed-off-by: kbuild test robot <lkp@intel.com> Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Signed-off-by: Maxime Jayat <maxime.jayat@mobile-devices.fr> Signed-off-by: Robin van der Gracht <robin@protonic.nl> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * | | | | | | | | | | | | | | | can: extend sockaddr_can to include j1939 membersKurt Van Dijck2019-09-041-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch prepares struct sockaddr_can for SAE J1939. Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * | | | | | | | | | | | | | | | can: add socket type for CAN_J1939Kurt Van Dijck2019-09-041-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a preparation for SAE J1939 and adds CAN_J1939 socket type. Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * | | | | | | | | | | | | | | | net: tls: export protocol version, cipher, tx_conf/rx_conf to socket diagDavide Caratti2019-08-312-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an application configures kernel TLS on top of a TCP socket, it's now possible for inet_diag_handler() to collect information regarding the protocol version, the cipher type and TX / RX configuration, in case INET_DIAG_INFO is requested. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | | | | | | | | | | tcp: ulp: add functions to dump ulp-specific informationDavide Caratti2019-08-311-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | currently, only getsockopt(TCP_ULP) can be invoked to know if a ULP is on top of a TCP socket. Extend idiag_get_aux() and idiag_get_aux_size(), introduced by commit b37e88407c1d ("inet_diag: allow protocols to provide additional data"), to report the ULP name and other information that can be made available by the ULP through optional functions. Users having CAP_NET_ADMIN privileges will then be able to retrieve this information through inet_diag_handler, if they specify INET_DIAG_INFO in the request. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud