summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
| * ALSA: hda/realtek: Enable headset MIC of Acer Aspire Z24-890 with ALC286Jian-Hong Pan2019-04-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2733ccebf4a937a0858e7d05a4a003b89715033f upstream. The Acer Aspire Z24-890 cannot detect the headset MIC until ALC286_FIXUP_ACER_AIO_HEADSET_MIC quirk applied. Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Signed-off-by: Daniel Drake <drake@endlessm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: hda/realtek: Enable headset MIC of Acer AIO with ALC286Jian-Hong Pan2019-04-031-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 667a8f73753908c4d0171e52b71774f9be5d6713 upstream. Some Acer AIO desktops like Veriton Z6860G, Z4860G and Z4660G cannot record sound from headset MIC. This patch adds the ALC286_FIXUP_ACER_AIO_HEADSET_MIC quirk to fix this issue. Fixes: 9f8aefed9623 ("ALSA: hda/realtek: Fix mic issue on Acer AIO Veriton Z4660G") Fixes: b72f936f6b32 ("ALSA: hda/realtek: Fix mic issue on Acer AIO Veriton Z4860G/Z6860G") Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Reviewed-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: hda/realtek - Add support headset mode for New DELL WYSE NBKailang Yang2019-04-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | commit da484d00f020af3dd7cfcc6c4b69a7f856832883 upstream. Enable headset mode support for new WYSE NB platform. Signed-off-by: Kailang Yang <kailang@realtek.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: hda/realtek - Add support headset mode for DELL WYSE AIOKailang Yang2019-04-031-0/+26
| | | | | | | | | | | | | | | | | | | | | | commit 136824efaab2c095fc911048f7c7ddeda258c965 upstream. This patch will enable WYSE AIO for Headset mode. Signed-off-by: Kailang Yang <kailang@realtek.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: hda/realtek: merge alc_fixup_headset_jack to alc295_fixup_chromebookJaroslav Kysela2019-04-031-30/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c8a9afa632f0fd45731d3353525faf1fdb362c89 upstream. The ALC225_FIXUP_HEADSET_JACK fixup can be merged to alc295_fixup_chromebook. There are no other users for ALC225_FIXUP_HEADSET_JACK other than the chromebook hardware. Fixes: 10f5b1b85ed1 ("ALSA: hda/realtek - Fixed Headset Mic JD not stable") Cc: Kailang Yang <kailang@realtek.com> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: hda/realtek - Fixed Headset Mic JD not stableKailang Yang2019-04-031-1/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 10f5b1b85ed10a80d45bc2db450e65bd792efaad upstream. It will be lose Mic JD state when Chrome OS boot and headset was plugged. Implement of reset combo jack JD. It will show normally. Fixes: e854747d7593 ("ALSA: hda/realtek - Enable headset button support for new codec") Signed-off-by: Kailang Yang <kailang@realtek.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: pcm: Don't suspend stream in unrecoverable PCM stateTakashi Iwai2019-04-031-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 113ce08109f8e3b091399e7cc32486df1cff48e7 upstream. Currently PCM core sets each opened stream forcibly to SUSPENDED state via snd_pcm_suspend_all() call, and the user-space is responsible for re-triggering the resume manually either via snd_pcm_resume() or prepare call. The scheme works fine usually, but there are corner cases where the stream can't be resumed by that call: the streams still in OPEN state before finishing hw_params. When they are suspended, user-space cannot perform resume or prepare because they haven't been set up yet. The only possible recovery is to re-open the device, which isn't nice at all. Similarly, when a stream is in DISCONNECTED state, it makes no sense to change it to SUSPENDED state. Ditto for in SETUP state; which you can re-prepare directly. So, this patch addresses these issues by filtering the PCM streams to be suspended by checking the PCM state. When a stream is in either OPEN, SETUP or DISCONNECTED as well as already SUSPENDED, the suspend action is skipped. To be noted, this problem was originally reported for the PCM runtime PM on HD-audio. And, the runtime PM problem itself was already addressed (although not intended) by the code refactoring commits 3d21ef0b49f8 ("ALSA: pcm: Suspend streams globally via device type PM ops") and 17bc4815de58 ("ALSA: pci: Remove superfluous snd_pcm_suspend*() calls"). These commits eliminated the snd_pcm_suspend*() calls from the runtime PM suspend callback code path, hence the racy OPEN state won't appear while runtime PM. (FWIW, the race window is between snd_pcm_open_substream() and the first power up in azx_pcm_open().) Although the runtime PM issue was already "fixed", the same problem is still present for the system PM, hence this patch is still needed. And for stable trees, this patch alone should suffice for fixing the runtime PM problem, too. Reported-and-tested-by: Jon Hunter <jonathanh@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: pcm: Fix possible OOB access in PCM oss pluginsTakashi Iwai2019-04-031-21/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ca0214ee2802dd47239a4e39fb21c5b00ef61b22 upstream. The PCM OSS emulation converts and transfers the data on the fly via "plugins". The data is converted over the dynamically allocated buffer for each plugin, and recently syzkaller caught OOB in this flow. Although the bisection by syzbot pointed out to the commit 65766ee0bf7f ("ALSA: oss: Use kvzalloc() for local buffer allocations"), this is merely a commit to replace vmalloc() with kvmalloc(), hence it can't be the cause. The further debug action revealed that this happens in the case where a slave PCM doesn't support only the stereo channels while the OSS stream is set up for a mono channel. Below is a brief explanation: At each OSS parameter change, the driver sets up the PCM hw_params again in snd_pcm_oss_change_params_lock(). This is also the place where plugins are created and local buffers are allocated. The problem is that the plugins are created before the final hw_params is determined. Namely, two snd_pcm_hw_param_near() calls for setting the period size and periods may influence on the final result of channels, rates, etc, too, while the current code has already created plugins beforehand with the premature values. So, the plugin believes that channels=1, while the actual I/O is with channels=2, which makes the driver reading/writing over the allocated buffer size. The fix is simply to move the plugin allocation code after the final hw_params call. Reported-by: syzbot+d4503ae45b65c5bc1194@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: seq: oss: Fix Spectre v1 vulnerabilityGustavo A. R. Silva2019-04-031-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c709f14f0616482b67f9fbcb965e1493a03ff30b upstream. dev is indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: sound/core/seq/oss/seq_oss_synth.c:626 snd_seq_oss_synth_make_info() warn: potential spectre issue 'dp->synths' [w] (local cap) Fix this by sanitizing dev before using it to index dp->synths. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/ Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ALSA: rawmidi: Fix potential Spectre v1 vulnerabilityGustavo A. R. Silva2019-04-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2b1d9c8f87235f593826b9cf46ec10247741fff9 upstream. info->stream is indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: sound/core/rawmidi.c:604 __snd_rawmidi_info_select() warn: potential spectre issue 'rmidi->streams' [r] (local cap) Fix this by sanitizing info->stream before using it to index rmidi->streams. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/ Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * net: dsa: qca8k: remove leftover phy accessorsChristian Lamparter2019-04-031-18/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 1eec7151ae0e134bd42e3f128066b2ff8da21393 upstream. This belated patch implements Andrew Lunn's request of "remove the phy_read() and phy_write() functions." <https://lore.kernel.org/patchwork/comment/902734/> While seemingly harmless, this causes the switch's user port PHYs to get registered twice. This is because the DSA subsystem will create a slave mdio-bus not knowing that the qca8k_phy_(read|write) accessors operate on the external mdio-bus. So the same "bus" gets effectively duplicated. Cc: stable@vger.kernel.org Fixes: 6b93fb46480a ("net-next: dsa: add new driver for qca8xxx family") Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * NFSv4.1 don't free interrupted slot on openOlga Kornievskaia2019-04-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0cb98abb5bd13b9a636bde603d952d722688b428 upstream. Allow the async rpc task for finish and update the open state if needed, then free the slot. Otherwise, the async rpc unable to decode the reply. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Fixes: ae55e59da0e4 ("pnfs: Don't release the sequence slot...") Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * NFS: fix mount/umount race in nlmclnt.NeilBrown2019-04-031-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4a9be28c45bf02fa0436808bb6c0baeba30e120e upstream. If the last NFSv3 unmount from a given host races with a mount from the same host, we can destroy an nlm_host that is still in use. Specifically nlmclnt_lookup_host() can increment h_count on an nlm_host that nlmclnt_release_host() has just successfully called refcount_dec_and_test() on. Once nlmclnt_lookup_host() drops the mutex, nlm_destroy_host_lock() will be called to destroy the nlmclnt which is now in use again. The cause of the problem is that the dec_and_test happens outside the locked region. This is easily fixed by using refcount_dec_and_mutex_lock(). Fixes: 8ea6ecc8b075 ("lockd: Create client-side nlm_host cache") Cc: stable@vger.kernel.org (v2.6.38+) Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * NFS: Fix nfs4_lock_state refcounting in nfs4_alloc_{lock,unlock}data()Catalin Marinas2019-04-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3028efe03be9c8c4cd7923f0f3c39b2871cc8a8f upstream. Commit 7b587e1a5a6c ("NFS: use locks_copy_lock() to copy locks.") changed the lock copying from memcpy() to the dedicated locks_copy_lock() function. The latter correctly increments the nfs4_lock_state.ls_count via nfs4_fl_copy_lock(), however, this refcount has already been incremented in the nfs4_alloc_{lock,unlock}data(). Kmemleak subsequently reports an unreferenced nfs4_lock_state object as below (arm64 platform): unreferenced object 0xffff8000fce0b000 (size 256): comm "systemd-sysuser", pid 1608, jiffies 4294892825 (age 32.348s) hex dump (first 32 bytes): 20 57 4c fb 00 80 ff ff 20 57 4c fb 00 80 ff ff WL..... WL..... 00 57 4c fb 00 80 ff ff 01 00 00 00 00 00 00 00 .WL............. backtrace: [<000000000d15010d>] kmem_cache_alloc+0x178/0x208 [<00000000d7c1d264>] nfs4_set_lock_state+0x124/0x1f0 [<000000009c867628>] nfs4_proc_lock+0x90/0x478 [<000000001686bd74>] do_setlk+0x64/0xe8 [<00000000e01500d4>] nfs_lock+0xe8/0x1f0 [<000000004f387d8d>] vfs_lock_file+0x18/0x40 [<00000000656ab79b>] do_lock_file_wait+0x68/0xf8 [<00000000f17c4a4b>] fcntl_setlk+0x224/0x280 [<0000000052a242c6>] do_fcntl+0x418/0x730 [<000000004f47291a>] __arm64_sys_fcntl+0x84/0xd0 [<00000000d6856e01>] el0_svc_common+0x80/0xf0 [<000000009c4bd1df>] el0_svc_handler+0x2c/0x80 [<00000000b1a0d479>] el0_svc+0x8/0xc [<0000000056c62a0f>] 0xffffffffffffffff This patch removes the original refcount_inc(&lsp->ls_count) that was paired with the memcpy() lock copying. Fixes: 7b587e1a5a6c ("NFS: use locks_copy_lock() to copy locks.") Cc: <stable@vger.kernel.org> # 5.0.x- Cc: NeilBrown <neilb@suse.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * vfio: ccw: only free cp on final interruptCornelia Huck2019-04-031-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 50b7f1b7236bab08ebbbecf90521e84b068d7a17 upstream. When we get an interrupt for a channel program, it is not necessarily the final interrupt; for example, the issuing guest may request an intermediate interrupt by specifying the program-controlled-interrupt flag on a ccw. We must not switch the state to idle if the interrupt is not yet final; even more importantly, we must not free the translated channel program if the interrupt is not yet final, or the host can crash during cp rewind. Fixes: e5f84dbaea59 ("vfio: ccw: return I/O results asynchronously") Cc: stable@vger.kernel.org # v4.12+ Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * powerpc: bpf: Fix generation of load/store DW instructionsNaveen N. Rao2019-04-035-18/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 86be36f6502c52ddb4b85938145324fd07332da1 upstream. Yauheni Kaliuta pointed out that PTR_TO_STACK store/load verifier test was failing on powerpc64 BE, and rightfully indicated that the PPC_LD() macro is not masking away the last two bits of the offset per the ISA, resulting in the generation of 'lwa' instruction instead of the intended 'ld' instruction. Segher also pointed out that we can't simply mask away the last two bits as that will result in loading/storing from/to a memory location that was not intended. This patch addresses this by using ldx/stdx if the offset is not word-aligned. We load the offset into a temporary register (TMP_REG_2) and use that as the index register in a subsequent ldx/stdx. We fix PPC_LD() macro to mask off the last two bits, but enhance PPC_BPF_LL() and PPC_BPF_STL() to factor in the offset value and generate the proper instruction sequence. We also convert all existing users of PPC_LD() and PPC_STD() to use these macros. All existing uses of these macros have been audited to ensure that TMP_REG_2 can be clobbered. Fixes: 156d0e290e96 ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF") Cc: stable@vger.kernel.org # v4.9+ Reported-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ARM: imx6q: cpuidle: fix bug that CPU might not wake up at expected timeKohji Okuno2019-04-031-17/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 91740fc8242b4f260cfa4d4536d8551804777fae upstream. In the current cpuidle implementation for i.MX6q, the CPU that sets 'WAIT_UNCLOCKED' and the CPU that returns to 'WAIT_CLOCKED' are always the same. While the CPU that sets 'WAIT_UNCLOCKED' is in IDLE state of "WAIT", if the other CPU wakes up and enters IDLE state of "WFI" istead of "WAIT", this CPU can not wake up at expired time. Because, in the case of "WFI", the CPU must be waked up by the local timer interrupt. But, while 'WAIT_UNCLOCKED' is set, the local timer is stopped, when all CPUs execute "wfi" instruction. As a result, the local timer interrupt is not fired. In this situation, this CPU will wake up by IRQ different from local timer. (e.g. broacast timer) So, this fix changes CPU to return to 'WAIT_CLOCKED'. Signed-off-by: Kohji Okuno <okuno.kohji@jp.panasonic.com> Fixes: e5f9dec8ff5f ("ARM: imx6q: support WAIT mode using cpuidle") Cc: <stable@vger.kernel.org> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tracing: initialize variable in create_dyn_event()Frank Rowand2019-04-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3dee10da2e9ff220e054a8f158cc296c797fbe81 upstream. Fix compile warning in create_dyn_event(): 'ret' may be used uninitialized in this function [-Wuninitialized]. Link: http://lkml.kernel.org/r/1553237900-8555-1-git-send-email-frowand.list@gmail.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Fixes: 5448d44c3855 ("tracing: Add unified dynamic event framework") Signed-off-by: Frank Rowand <frank.rowand@sony.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * locks: wake any locks blocked on request before deadlock checkJeff Layton2019-04-031-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 945ab8f6de94430c23a82f3cf2e3f6d6f2945ff7 upstream. Andreas reported that he was seeing the tdbtorture test fail in some cases with -EDEADLCK when it wasn't before. Some debugging showed that deadlock detection was sometimes discovering the caller's lock request itself in a dependency chain. While we remove the request from the blocked_lock_hash prior to reattempting to acquire it, any locks that are blocked on that request will still be present in the hash and will still have their fl_blocker pointer set to the current request. This causes posix_locks_deadlock to find a deadlock dependency chain when it shouldn't, as a lock request cannot block itself. We are going to end up waking all of those blocked locks anyway when we go to reinsert the request back into the blocked_lock_hash, so just do it prior to checking for deadlocks. This ensures that any lock blocked on the current request will no longer be part of any blocked request chain. URL: https://bugzilla.kernel.org/show_bug.cgi?id=202975 Fixes: 5946c4319ebb ("fs/locks: allow a lock request to block other requests.") Cc: stable@vger.kernel.org Reported-by: Andreas Schneider <asn@redhat.com> Signed-off-by: Neil Brown <neilb@suse.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Btrfs: fix assertion failure on fsync with NO_HOLES enabledFilipe Manana2019-04-031-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0ccc3876e4b2a1559a4dbe3126dda4459d38a83b upstream. Back in commit a89ca6f24ffe4 ("Btrfs: fix fsync after truncate when no_holes feature is enabled") I added an assertion that is triggered when an inline extent is found to assert that the length of the (uncompressed) data the extent represents is the same as the i_size of the inode, since that is true most of the time I couldn't find or didn't remembered about any exception at that time. Later on the assertion was expanded twice to deal with a case of a compressed inline extent representing a range that matches the sector size followed by an expanding truncate, and another case where fallocate can update the i_size of the inode without adding or updating existing extents (if the fallocate range falls entirely within the first block of the file). These two expansion/fixes of the assertion were done by commit 7ed586d0a8241 ("Btrfs: fix assertion on fsync of regular file when using no-holes feature") and commit 6399fb5a0b69a ("Btrfs: fix assertion failure during fsync in no-holes mode"). These however missed the case where an falloc expands the i_size of an inode to exactly the sector size and inline extent exists, for example: $ mkfs.btrfs -f -O no-holes /dev/sdc $ mount /dev/sdc /mnt $ xfs_io -f -c "pwrite -S 0xab 0 1096" /mnt/foobar wrote 1096/1096 bytes at offset 0 1 KiB, 1 ops; 0.0002 sec (4.448 MiB/sec and 4255.3191 ops/sec) $ xfs_io -c "falloc 1096 3000" /mnt/foobar $ xfs_io -c "fsync" /mnt/foobar Segmentation fault $ dmesg [701253.602385] assertion failed: len == i_size || (len == fs_info->sectorsize && btrfs_file_extent_compression(leaf, extent) != BTRFS_COMPRESS_NONE) || (len < i_size && i_size < fs_info->sectorsize), file: fs/btrfs/tree-log.c, line: 4727 [701253.602962] ------------[ cut here ]------------ [701253.603224] kernel BUG at fs/btrfs/ctree.h:3533! [701253.603503] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI [701253.603774] CPU: 2 PID: 7192 Comm: xfs_io Tainted: G W 5.0.0-rc8-btrfs-next-45 #1 [701253.604054] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [701253.604650] RIP: 0010:assfail.constprop.23+0x18/0x1a [btrfs] (...) [701253.605591] RSP: 0018:ffffbb48c186bc48 EFLAGS: 00010286 [701253.605914] RAX: 00000000000000de RBX: ffff921d0a7afc08 RCX: 0000000000000000 [701253.606244] RDX: 0000000000000000 RSI: ffff921d36b16868 RDI: ffff921d36b16868 [701253.606580] RBP: ffffbb48c186bcf0 R08: 0000000000000000 R09: 0000000000000000 [701253.606913] R10: 0000000000000003 R11: 0000000000000000 R12: ffff921d05d2de18 [701253.607247] R13: ffff921d03b54000 R14: 0000000000000448 R15: ffff921d059ecf80 [701253.607769] FS: 00007f14da906700(0000) GS:ffff921d36b00000(0000) knlGS:0000000000000000 [701253.608163] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [701253.608516] CR2: 000056087ea9f278 CR3: 00000002268e8001 CR4: 00000000003606e0 [701253.608880] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [701253.609250] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [701253.609608] Call Trace: [701253.609994] btrfs_log_inode+0xdfb/0xe40 [btrfs] [701253.610383] btrfs_log_inode_parent+0x2be/0xa60 [btrfs] [701253.610770] ? do_raw_spin_unlock+0x49/0xc0 [701253.611150] btrfs_log_dentry_safe+0x4a/0x70 [btrfs] [701253.611537] btrfs_sync_file+0x3b2/0x440 [btrfs] [701253.612010] ? do_sysinfo+0xb0/0xf0 [701253.612552] do_fsync+0x38/0x60 [701253.612988] __x64_sys_fsync+0x10/0x20 [701253.613360] do_syscall_64+0x60/0x1b0 [701253.613733] entry_SYSCALL_64_after_hwframe+0x49/0xbe [701253.614103] RIP: 0033:0x7f14da4e66d0 (...) [701253.615250] RSP: 002b:00007fffa670fdb8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a [701253.615647] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f14da4e66d0 [701253.616047] RDX: 000056087ea9c260 RSI: 000056087ea9c260 RDI: 0000000000000003 [701253.616450] RBP: 0000000000000001 R08: 0000000000000020 R09: 0000000000000010 [701253.616854] R10: 000000000000009b R11: 0000000000000246 R12: 000056087ea9c260 [701253.617257] R13: 000056087ea9c240 R14: 0000000000000000 R15: 000056087ea9dd10 (...) [701253.619941] ---[ end trace e088d74f132b6da5 ]--- Updating the assertion again to allow for this particular case would result in a meaningless assertion, plus there is currently no risk of logging content that would result in any corruption after a log replay if the size of the data encoded in an inline extent is greater than the inode's i_size (which is not currently possibe either with or without compression), therefore just remove the assertion. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * btrfs: Avoid possible qgroup_rsv_size overflow in ↵Nikolay Borisov2019-04-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | btrfs_calculate_inode_block_rsv_size commit 139a56170de67101791d6e6c8e940c6328393fe9 upstream. qgroup_rsv_size is calculated as the product of outstanding_extent * fs_info->nodesize. The product is calculated with 32 bit precision since both variables are defined as u32. Yet qgroup_rsv_size expects a 64 bit result. Avoid possible multiplication overflow by casting outstanding_extent to u64. Such overflow would in the worst case (64K nodesize) require more than 65536 extents, which is quite large and i'ts not likely that it would happen in practice. Fixes-coverity-id: 1435101 Fixes: ff6bc37eb7f6 ("btrfs: qgroup: Use independent and accurate per inode qgroup rsv") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * btrfs: Fix bound checking in qgroup_trace_new_subtree_blocksNikolay Borisov2019-04-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 7ff2c2a1a71e83f74574b8001ea88deb3c166ad7 upstream. If 'cur_level' is 7 then the bound checking at the top of the function will actually pass. Later on, it's possible to dereference ds_path->nodes[cur_level+1] which will be an out of bounds. The correct check will be cur_level >= BTRFS_MAX_LEVEL - 1 . Fixes-coverty-id: 1440918 Fixes-coverty-id: 1440911 Fixes: ea49f3e73c4b ("btrfs: qgroup: Introduce function to find all new tree blocks of reloc tree") CC: stable@vger.kernel.org # 4.20+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * btrfs: raid56: properly unmap parity page in finish_parity_scrub()Andrea Righi2019-04-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3897b6f0a859288c22fb793fad11ec2327e60fcd upstream. Parity page is incorrectly unmapped in finish_parity_scrub(), triggering a reference counter bug on i386, i.e.: [ 157.662401] kernel BUG at mm/highmem.c:349! [ 157.666725] invalid opcode: 0000 [#1] SMP PTI The reason is that kunmap(p_page) was completely left out, so we never did an unmap for the p_page and the loop unmapping the rbio page was iterating over the wrong number of stripes: unmapping should be done with nr_data instead of rbio->real_stripes. Test case to reproduce the bug: - create a raid5 btrfs filesystem: # mkfs.btrfs -m raid5 -d raid5 /dev/sdb /dev/sdc /dev/sdd /dev/sde - mount it: # mount /dev/sdb /mnt - run btrfs scrub in a loop: # while :; do btrfs scrub start -BR /mnt; done BugLink: https://bugs.launchpad.net/bugs/1812845 Fixes: 5a6ac9eacb49 ("Btrfs, raid56: support parity scrub on raid56") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * btrfs: don't report readahead errors and don't update statisticsDavid Sterba2019-04-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0cc068e6ee59c1fffbfa977d8bf868b7551d80ac upstream. As readahead is an optimization, all errors are usually filtered out, but still properly handled when the real read call is done. The commit 5e9d398240b2 ("btrfs: readpages() should submit IO as read-ahead") added REQ_RAHEAD to readpages() because that's only used for readahead (despite what one would expect from the callback name). This causes a flood of messages and inflated read error stats, so skip reporting in case it's readahead. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202403 Reported-by: LimeTech <tomm@lime-technology.com> Fixes: 5e9d398240b2 ("btrfs: readpages() should submit IO as read-ahead") CC: stable@vger.kernel.org # 4.19+ Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * btrfs: remove WARN_ON in log_dir_itemsJosef Bacik2019-04-031-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2cc8334270e281815c3850c3adea363c51f21e0d upstream. When Filipe added the recursive directory logging stuff in 2f2ff0ee5e430 ("Btrfs: fix metadata inconsistencies after directory fsync") he specifically didn't take the directory i_mutex for the children directories that we need to log because of lockdep. This is generally fine, but can lead to this WARN_ON() tripping if we happen to run delayed deletion's in between our first search and our second search of dir_item/dir_indexes for this directory. We expect this to happen, so the WARN_ON() isn't necessary. Drop the WARN_ON() and add a comment so we know why this case can happen. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Btrfs: fix incorrect file size after shrinking truncate and fsyncFilipe Manana2019-04-031-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit bf504110bc8aa05df48b0e5f0aa84bfb81e0574b upstream. If we do a shrinking truncate against an inode which is already present in the respective log tree and then rename it, as part of logging the new name we end up logging an inode item that reflects the old size of the file (the one which we previously logged) and not the new smaller size. The decision to preserve the size previously logged was added by commit 1a4bcf470c886b ("Btrfs: fix fsync data loss after adding hard link to inode") in order to avoid data loss after replaying the log. However that decision is only needed for the case the logged inode size is smaller then the current size of the inode, as explained in that commit's change log. If the current size of the inode is smaller then the previously logged size, we know a shrinking truncate happened and therefore need to use that smaller size. Example to trigger the problem: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ xfs_io -f -c "pwrite -S 0xab 0 8000" /mnt/foo $ xfs_io -c "fsync" /mnt/foo $ xfs_io -c "truncate 3000" /mnt/foo $ mv /mnt/foo /mnt/bar $ xfs_io -c "fsync" /mnt/bar <power failure> $ mount /dev/sdb /mnt $ od -t x1 -A d /mnt/bar 0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab * 0008000 Once we rename the file, we log its name (and inode item), and because the inode was already logged before in the current transaction, we log it with a size of 8000 bytes because that is the size we previously logged (with the first fsync). As part of the rename, besides logging the inode, we do also sync the log, which is done since commit d4682ba03ef618 ("Btrfs: sync log after logging new name"), so the next fsync against our inode is effectively a no-op, since no new changes happened since the rename operation. Even if did not sync the log during the rename operation, the same problem (fize size of 8000 bytes instead of 3000 bytes) would be visible after replaying the log if the log ended up getting synced to disk through some other means, such as for example by fsyncing some other modified file. In the example above the fsync after the rename operation is there just because not every filesystem may guarantee logging/journalling the inode (and syncing the log/journal) during the rename operation, for example it is needed for f2fs, but not for ext4 and xfs. Fix this scenario by, when logging a new name (which is triggered by rename and link operations), using the current size of the inode instead of the previously logged inode size. A test case for fstests follows soon. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202695 CC: stable@vger.kernel.org # 4.4+ Reported-by: Seulbae Kim <seulbae@gatech.edu> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * powerpc/fsl: Fix the flush of branch predictor.Christophe Leroy2019-04-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 27da80719ef132cf8c80eb406d5aeb37dddf78cc upstream. The commit identified below adds MC_BTB_FLUSH macro only when CONFIG_PPC_FSL_BOOK3E is defined. This results in the following error on some configs (seen several times with kisskb randconfig_defconfig) arch/powerpc/kernel/exceptions-64e.S:576: Error: Unrecognized opcode: `mc_btb_flush' make[3]: *** [scripts/Makefile.build:367: arch/powerpc/kernel/exceptions-64e.o] Error 1 make[2]: *** [scripts/Makefile.build:492: arch/powerpc/kernel] Error 2 make[1]: *** [Makefile:1043: arch/powerpc] Error 2 make: *** [Makefile:152: sub-make] Error 2 This patch adds a blank definition of MC_BTB_FLUSH for other cases. Fixes: 10c5e83afd4a ("powerpc/fsl: Flush the branch predictor at each kernel entry (64bit)") Cc: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tun: add a missing rcu_read_unlock() in error pathEric Dumazet2019-04-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 9180bb4f046064dfa4541488102703b402bb04e1 upstream. In my latest patch I missed one rcu_read_unlock(), in case device is down. Fixes: 4477138fa0ae ("tun: properly test for IFF_UP") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * ila: Fix rhashtable walker list corruptionHerbert Xu2019-04-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit b5f9bd15b88563b55a99ed588416881367a0ce5f ] ila_xlat_nl_cmd_flush uses rhashtable walkers allocated from the stack but it never frees them. This corrupts the walker list of the hash table. This patch fixes it. Reported-by: syzbot+dae72a112334aa65a159@syzkaller.appspotmail.com Fixes: b6e71bdebb12 ("ila: Flush netlink command to clear xlat...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * r8169: fix cable re-plugging issueHeiner Kallweit2019-04-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 23c78343ec36990709b636a9e02bad814f4384ad ] Bartek reported that after few cable unplug/replug cycles suddenly replug isn't detected any longer. His system uses a RTL8106, I wasn't able to reproduce the issue with RTL8168g. According to his bisect the referenced commit caused the regression. As Realtek doesn't release datasheets or errata it's hard to say what's the actual root cause, but this change was reported to fix the issue. Fixes: 38caff5a445b ("r8169: handle all interrupt events in the hard irq handler") Reported-by: Bartosz Skrzypczak <barteks2x@gmail.com> Suggested-by: Bartosz Skrzypczak <barteks2x@gmail.com> Tested-by: Bartosz Skrzypczak <barteks2x@gmail.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * net: phy: don't clear BMCR in genphy_soft_resetHeiner Kallweit2019-04-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d29f5aa0bc0c321e1b9e4658a2a7e08e885da52a ] So far we effectively clear the BMCR register. Some PHY's can deal with this (e.g. because they reset BMCR to a default as part of a soft-reset) whilst on others this causes issues because e.g. the autoneg bit is cleared. Marvell is an example, see also thread [0]. So let's be a little bit more gentle and leave all bits we're not interested in as-is. This change is needed for PHY drivers to properly deal with the original patch. [0] https://marc.info/?t=155264050700001&r=1&w=2 Fixes: 6e2d85ec0559 ("net: phy: Stop with excessive soft reset") Tested-by: Phil Reid <preid@electromag.com.au> Tested-by: liweihang <liweihang@hisilicon.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * net: mii: Fix PAUSE cap advertisement from linkmode_adv_to_lcl_adv_t() helperClaudiu Manoil2019-04-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 7f07e5f1f778605e98cf2156d4db1ff3a3a1a74a ] With a recent link mode advertisement code update this helper providing local pause capability translation used for flow control link mode negotiation got broken. For eth drivers using this helper, the issue is apparent only if either PAUSE or ASYM_PAUSE is being advertised. Fixes: 3c1bcc8614db ("net: ethernet: Convert phydev advertize and supported from u32 to link mode") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * net: dsa: mv88e6xxx: fix few issues in mv88e6390x_port_set_cmodeHeiner Kallweit2019-04-031-8/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 5ceaeb99ffb4dc002d20f6ac243c19a85e2c7a76 ] This patches fixes few issues in mv88e6390x_port_set_cmode(). 1. When entering the function the old cmode may be 0, in this case mv88e6390x_serdes_get_lane() returns -ENODEV. As result we bail out and have no chance to set a new mode. Therefore deal properly with -ENODEV. 2. Once we have disabled power and irq, let's set the cached cmode to 0. This reflects the actual status and is cleaner if we bail out with an error in the following function calls. 3. The cached cmode is used by mv88e6390x_serdes_get_lane(), mv88e6390_serdes_power_lane() and mv88e6390_serdes_irq_enable(). Currently we set the cached mode to the new one at the very end of the function only, means until then we use the old one what may be wrong. 4. When calling mv88e6390_serdes_irq_enable() we use the lane value belonging to the old cmode. Get the lane belonging to the new cmode before calling this function. It's hard to provide a good "Fixes" tag because quite a few smaller changes have been done to the code in question recently. Fixes: d235c48b40d3 ("net: dsa: mv88e6xxx: power serdes on/off for 10G interfaces on 6390X") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * thunderx: eliminate extra calls to put_page() for pages held for recyclingDean Nelson2019-04-031-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit cd35ef91490ad8049dd180bb060aff7ee192eda9 ] For the non-XDP case, commit 773225388dae15e72790 ("net: thunderx: Optimize page recycling for XDP") added code to nicvf_free_rbdr() that, when releasing the additional receive buffer page reference held for recycling, repeatedly calls put_page() until the page's _refcount goes to zero. Which results in the page being freed. This is not okay if the page's _refcount was greater than 1 (in the non-XDP case), because nicvf_free_rbdr() should not be subtracting more than what nicvf_alloc_page() had previously added to the page's _refcount, which was only 1 (in the non-XDP case). This can arise if a received packet is still being processed and the receive buffer (i.e., skb->head) has not yet been freed via skb_free_head() when nicvf_free_rbdr() is spinning through the aforementioned put_page() loop. If this should occur, when the received packet finishes processing and skb_free_head() is called, various problems can ensue. Exactly what, depends on whether the page has already been reallocated or not, anything from "BUG: Bad page state ... ", to "Unable to handle kernel NULL pointer dereference ..." or "Unable to handle kernel paging request...". So this patch changes nicvf_free_rbdr() to only call put_page() once for pages held for recycling (in the non-XDP case). Fixes: 773225388dae ("net: thunderx: Optimize page recycling for XDP") Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * thunderx: enable page recycling for non-XDP caseDean Nelson2019-04-031-12/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit b3e208069477588c06f4d5d986164b435bb06e6d ] Commit 773225388dae15e72790 ("net: thunderx: Optimize page recycling for XDP") added code to nicvf_alloc_page() that inadvertently disables receive buffer page recycling for the non-XDP case by always NULL'ng the page pointer. This patch corrects two if-conditionals to allow for the recycling of non-XDP mode pages by only setting the page pointer to NULL when the page is not ready for recycling. Fixes: 773225388dae ("net: thunderx: Optimize page recycling for XDP") Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * vxlan: Don't call gro_cells_destroy() before device is unregisteredZhiqiang Liu2019-04-031-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit cc4807bb609230d8959fd732b0bf3bd4c2de8eac ] Commit ad6c9986bcb62 ("vxlan: Fix GRO cells race condition between receive and link delete") fixed a race condition for the typical case a vxlan device is dismantled from the current netns. But if a netns is dismantled, vxlan_destroy_tunnels() is called to schedule a unregister_netdevice_queue() of all the vxlan tunnels that are related to this netns. In vxlan_destroy_tunnels(), gro_cells_destroy() is called and finished before unregister_netdevice_queue(). This means that the gro_cells_destroy() call is done too soon, for the same reasons explained in above commit. So we need to fully respect the RCU rules, and thus must remove the gro_cells_destroy() call or risk use after-free. Fixes: 58ce31cca1ff ("vxlan: GRO support at tunnel layer") Signed-off-by: Suanming.Mou <mousuanming@huawei.com> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * vrf: prevent adding upper devicesSabrina Dubroca2019-04-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 1017e0987117c32783ba7c10fe2e7ff1456ba1dc ] VRF devices don't work with upper devices. Currently, it's possible to add a VRF device to a bridge or team, and to create macvlan, macsec, or ipvlan devices on top of a VRF (bond and vlan are prevented respectively by the lack of an ndo_set_mac_address op and the NETIF_F_VLAN_CHALLENGED feature flag). Fix this by setting the IFF_NO_RX_HANDLER flag (introduced in commit f5426250a6ec ("net: introduce IFF_NO_RX_HANDLER")). Cc: David Ahern <dsahern@gmail.com> Fixes: 193125dbd8eb ("net: Introduce VRF device driver") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tun: properly test for IFF_UPEric Dumazet2019-04-031-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 4477138fa0ae4e1b699786ef0600863ea6e6c61c ] Same reasons than the ones explained in commit 4179cb5a4c92 ("vxlan: test dev->flags & IFF_UP before calling netif_rx()") netif_rx_ni() or napi_gro_frags() must be called under a strict contract. At device dismantle phase, core networking clears IFF_UP and flush_all_backlogs() is called after rcu grace period to make sure no incoming packet might be in a cpu backlog and still referencing the device. A similar protocol is used for gro layer. Most drivers call netif_rx() from their interrupt handler, and since the interrupts are disabled at device dismantle, netif_rx() does not have to check dev->flags & IFF_UP Virtual drivers do not have this guarantee, and must therefore make the check themselves. Fixes: 1bd4978a88ac ("tun: honor IFF_UP in tun_get_user()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tipc: fix cancellation of topology subscriptionsErik Hugne2019-04-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 33872d79f5d1cbedaaab79669cc38f16097a9450 ] When cancelling a subscription, we have to clear the cancel bit in the request before iterating over any established subscriptions with memcmp. Otherwise no subscription will ever be found, and it will not be possible to explicitly unsubscribe individual subscriptions. Fixes: 8985ecc7c1e0 ("tipc: simplify endianness handling in topology subscriber") Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tipc: change to check tipc_own_id to return in tipc_net_stopXin Long2019-04-031-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 9926cb5f8b0f0aea535735185600d74db7608550 ] When running a syz script, a panic occurred: [ 156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 [tipc] [ 156.094315] Call Trace: [ 156.094844] <IRQ> [ 156.095306] dump_stack+0x7c/0xc0 [ 156.097346] print_address_description+0x65/0x22e [ 156.100445] kasan_report.cold.3+0x37/0x7a [ 156.102402] tipc_disc_timeout+0x9c9/0xb20 [tipc] [ 156.106517] call_timer_fn+0x19a/0x610 [ 156.112749] run_timer_softirq+0xb51/0x1090 It was caused by the netns freed without deleting the discoverer timer, while later on the netns would be accessed in the timer handler. The timer should have been deleted by tipc_net_stop() when cleaning up a netns. However, tipc has been able to enable a bearer and start d->timer without the local node_addr set since Commit 52dfae5c85a4 ("tipc: obtain node identity from interface by default"), which caused the timer not to be deleted in tipc_net_stop() then. So fix it in tipc_net_stop() by changing to check local node_id instead of local node_addr, as Jon suggested. While at it, remove the calling of tipc_nametbl_withdraw() there, since tipc_nametbl_stop() will take of the nametbl's freeing after. Fixes: 52dfae5c85a4 ("tipc: obtain node identity from interface by default") Reported-by: syzbot+a25307ad099309f1c2b9@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tipc: allow service ranges to be connect()'ed on RDM/DGRAMErik Hugne2019-04-031-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit ea239314fe42ace880bdd834256834679346c80e ] We move the check that prevents connecting service ranges to after the RDM/DGRAM check, and move address sanity control to a separate function that also validates the service range. Fixes: 23998835be98 ("tipc: improve address sanity check in tipc_connect()") Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * tcp: do not use ipv6 header for ipv4 flowEric Dumazet2019-04-031-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 89e4130939a20304f4059ab72179da81f5347528 ] When a dual stack tcp listener accepts an ipv4 flow, it should not attempt to use an ipv6 header or tcp_v6_iif() helper. Fixes: 1397ed35f22d ("ipv6: add flowinfo for tcp6 pkt_options for all cases") Fixes: df3687ffc665 ("ipv6: add the IPV6_FL_F_REFLECT flag to IPV6_FL_A_GET") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * sctp: use memdup_user instead of vmemdup_userXin Long2019-04-031-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit ef82bcfa671b9a635bab5fa669005663d8b177c5 ] In sctp_setsockopt_bindx()/__sctp_setsockopt_connectx(), it allocates memory with addrs_size which is passed from userspace. We used flag GFP_USER to put some more restrictions on it in Commit cacc06215271 ("sctp: use GFP_USER for user-controlled kmalloc"). However, since Commit c981f254cc82 ("sctp: use vmemdup_user() rather than badly open-coding memdup_user()"), vmemdup_user() has been used, which doesn't check GFP_USER flag when goes to vmalloc_*(). So when addrs_size is a huge value, it could exhaust memory and even trigger oom killer. This patch is to use memdup_user() instead, in which GFP_USER would work to limit the memory allocation with a huge addrs_size. Note we can't fix it by limiting 'addrs_size', as there's no demand for it from RFC. Reported-by: syzbot+ec1b7575afef85a0e5ca@syzkaller.appspotmail.com Fixes: c981f254cc82 ("sctp: use vmemdup_user() rather than badly open-coding memdup_user()") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * sctp: get sctphdr by offset in sctp_compute_cksumXin Long2019-04-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 273160ffc6b993c7c91627f5a84799c66dfe4dee ] sctp_hdr(skb) only works when skb->transport_header is set properly. But in Netfilter, skb->transport_header for ipv6 is not guaranteed to be right value for sctphdr. It would cause to fail to check the checksum for sctp packets. So fix it by using offset, which is always right in all places. v1->v2: - Fix the changelog. Fixes: e6d8b64b34aa ("net: sctp: fix and consolidate SCTP checksumming code") Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * rhashtable: Still do rehash when we get EEXISTHerbert Xu2019-04-031-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 408f13ef358aa5ad56dc6230c2c7deb92cf462b1 ] As it stands if a shrink is delayed because of an outstanding rehash, we will go into a rescheduling loop without ever doing the rehash. This patch fixes this by still carrying out the rehash and then rescheduling so that we can shrink after the completion of the rehash should it still be necessary. The return value of EEXIST captures this case and other cases (e.g., another thread expanded/rehashed the table at the same time) where we should still proceed with the rehash. Fixes: da20420f83ea ("rhashtable: Add nested tables") Reported-by: Josh Elsasser <jelsasser@appneta.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Josh Elsasser <jelsasser@appneta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * packets: Always register packet sk in the same orderMaxime Chevallier2019-04-032-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit a4dc6a49156b1f8d6e17251ffda17c9e6a5db78a ] When using fanouts with AF_PACKET, the demux functions such as fanout_demux_cpu will return an index in the fanout socket array, which corresponds to the selected socket. The ordering of this array depends on the order the sockets were added to a given fanout group, so for FANOUT_CPU this means sockets are bound to cpus in the order they are configured, which is OK. However, when stopping then restarting the interface these sockets are bound to, the sockets are reassigned to the fanout group in the reverse order, due to the fact that they were inserted at the head of the interface's AF_PACKET socket list. This means that traffic that was directed to the first socket in the fanout group is now directed to the last one after an interface restart. In the case of FANOUT_CPU, traffic from CPU0 will be directed to the socket that used to receive traffic from the last CPU after an interface restart. This commit introduces a helper to add a socket at the tail of a list, then uses it to register AF_PACKET sockets. Note that this changes the order in which sockets are listed in /proc and with sock_diag. Fixes: dc99f600698d ("packet: Add fanout support") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * net: usb: aqc111: Extend HWID table by QNAP deviceDmitry Bezrukov2019-04-032-0/+23
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit b7ebee2f95fb0fa2862d5ed2de707f872c311393 ] New device of QNAP based on aqc111u Add this ID to blacklist of cdc_ether driver as well Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * net-sysfs: call dev_hold if kobject_init_and_add successYueHaibing2019-04-031-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit a3e23f719f5c4a38ffb3d30c8d7632a4ed8ccd9e ] In netdev_queue_add_kobject and rx_queue_add_kobject, if sysfs_create_group failed, kobject_put will call netdev_queue_release to decrease dev refcont, however dev_hold has not be called. So we will see this while unregistering dev: unregister_netdevice: waiting for bcsh0 to become free. Usage count = -1 Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: d0d668371679 ("net: don't decrement kobj reference count on init failure") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * net: stmmac: fix memory corruption with large MTUsAaro Koskinen2019-04-031-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 223a960c01227e4dbcb6f9fa06b47d73bda21274 ] When using 16K DMA buffers and ring mode, the DES3 refill is not working correctly as the function is using a bogus pointer for checking the private data. As a result stale pointers will remain in the RX descriptor ring, so DMA will now likely overwrite/corrupt some already freed memory. As simple reproducer, just receive some UDP traffic: # ifconfig eth0 down; ifconfig eth0 mtu 9000; ifconfig eth0 up # iperf3 -c 192.168.253.40 -u -b 0 -R If you didn't crash by now check the RX descriptors to find non-contiguous RX buffers: cat /sys/kernel/debug/stmmaceth/eth0/descriptors_status [...] 1 [0x2be5020]: 0xa3220321 0x9ffc1ffc 0x72d70082 0x130e207e ^^^^^^^^^^^^^^^^^^^^^ 2 [0x2be5040]: 0xa3220321 0x9ffc1ffc 0x72998082 0x1311a07e ^^^^^^^^^^^^^^^^^^^^^ A simple ping test will now report bad data: # ping -s 8200 192.168.253.40 PING 192.168.253.40 (192.168.253.40) 8200(8228) bytes of data. 8208 bytes from 192.168.253.40: icmp_seq=1 ttl=64 time=1.00 ms wrong data byte #8144 should be 0xd0 but was 0x88 Fix the wrong pointer. Also we must refill DES3 only if the DMA buffer size is 16K. Fixes: 54139cf3bb33 ("net: stmmac: adding multiple buffers for rx") Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com> Acked-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * net: rose: fix a possible stack overflowEric Dumazet2019-04-031-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit e5dcc0c3223c45c94100f05f28d8ef814db3d82c ] rose_write_internal() uses a temp buffer of 100 bytes, but a manual inspection showed that given arbitrary input, rose_create_facilities() can fill up to 110 bytes. Lets use a tailroom of 256 bytes for peace of mind, and remove the bounce buffer : we can simply allocate a big enough skb and adjust its length as needed. syzbot report : BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:352 [inline] BUG: KASAN: stack-out-of-bounds in rose_create_facilities net/rose/rose_subr.c:521 [inline] BUG: KASAN: stack-out-of-bounds in rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116 Write of size 7 at addr ffff88808b1ffbef by task syz-executor.0/24854 CPU: 0 PID: 24854 Comm: syz-executor.0 Not tainted 5.0.0+ #97 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x123/0x190 mm/kasan/generic.c:191 memcpy+0x38/0x50 mm/kasan/common.c:131 memcpy include/linux/string.h:352 [inline] rose_create_facilities net/rose/rose_subr.c:521 [inline] rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116 rose_connect+0x7cb/0x1510 net/rose/af_rose.c:826 __sys_connect+0x266/0x330 net/socket.c:1685 __do_sys_connect net/socket.c:1696 [inline] __se_sys_connect net/socket.c:1693 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1693 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458079 Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f47b8d9dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458079 RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000004 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f47b8d9e6d4 R13: 00000000004be4a4 R14: 00000000004ceca8 R15: 00000000ffffffff The buggy address belongs to the page: page:ffffea00022c7fc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x1fffc0000000000() raw: 01fffc0000000000 0000000000000000 ffffffff022c0101 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88808b1ffa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88808b1ffb00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 03 >ffff88808b1ffb80: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 04 f3 ^ ffff88808b1ffc00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 ffff88808b1ffc80: 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 01 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
OpenPOWER on IntegriCloud