<feed xmlns='http://www.w3.org/2005/Atom'>
<title>blackbird-op-linux/kernel/seccomp.c, branch v4.2</title>
<subtitle>Blackbird™ Linux sources for OpenPOWER</subtitle>
<id>https://git.raptorcs.com/git/blackbird-op-linux/atom?h=v4.2</id>
<link rel='self' href='https://git.raptorcs.com/git/blackbird-op-linux/atom?h=v4.2'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/'/>
<updated>2015-05-09T21:35:05+00:00</updated>
<entry>
<title>seccomp, filter: add and use bpf_prog_create_from_user from seccomp</title>
<updated>2015-05-09T21:35:05+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-05-06T14:12:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=ac67eb2c5347bd9976308c0e0cf1d9e7ca690342'/>
<id>urn:sha1:ac67eb2c5347bd9976308c0e0cf1d9e7ca690342</id>
<content type='text'>
Seccomp has always been a special candidate when it comes to preparation
of its filters in seccomp_prepare_filter(). Due to the extra checks and
filter rewrite it partially duplicates code and has BPF internals exposed.

This patch adds a generic API inside the BPF code code that seccomp can use
and thus keep it's filter preparation code minimal and better maintainable.
The other side-effect is that now classic JITs can add seccomp support as
well by only providing a BPF_LDX | BPF_W | BPF_ABS translation.

Tested with seccomp and BPF test suites.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Nicolas Schichan &lt;nschichan@freebox.fr&gt;
Cc: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>seccomp: simplify seccomp_prepare_filter and reuse bpf_prepare_filter</title>
<updated>2015-05-09T21:35:05+00:00</updated>
<author>
<name>Nicolas Schichan</name>
<email>nschichan@freebox.fr</email>
</author>
<published>2015-05-06T14:12:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=d9e12f42e58da475379b9080708b94f2095904af'/>
<id>urn:sha1:d9e12f42e58da475379b9080708b94f2095904af</id>
<content type='text'>
Remove the calls to bpf_check_classic(), bpf_convert_filter() and
bpf_migrate_runtime() and let bpf_prepare_filter() take care of that
instead.

seccomp_check_filter() is passed to bpf_prepare_filter() so that it
gets called from there, after bpf_check_classic().

We can now remove exposure of two internal classic BPF functions
previously used by seccomp. The export of bpf_check_classic() symbol,
previously known as sk_chk_filter(), was there since pre git times,
and no in-tree module was using it, therefore remove it.

Joint work with Daniel Borkmann.

Signed-off-by: Nicolas Schichan &lt;nschichan@freebox.fr&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>seccomp: cap SECCOMP_RET_ERRNO data to MAX_ERRNO</title>
<updated>2015-02-17T22:34:55+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2015-02-17T21:48:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=580c57f1076872ebc2427f898b927944ce170f2d'/>
<id>urn:sha1:580c57f1076872ebc2427f898b927944ce170f2d</id>
<content type='text'>
The value resulting from the SECCOMP_RET_DATA mask could exceed MAX_ERRNO
when setting errno during a SECCOMP_RET_ERRNO filter action.  This makes
sure we have a reliable value being set, so that an invalid errno will not
be ignored by userspace.

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Reported-by: Dmitry V. Levin &lt;ldv@altlinux.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Will Drewry &lt;wad@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'x86-seccomp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2014-10-14T00:27:06+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-10-14T00:27:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=ba1a96fc7ddcaf0c8d4a6752f6a70f080bc307ac'/>
<id>urn:sha1:ba1a96fc7ddcaf0c8d4a6752f6a70f080bc307ac</id>
<content type='text'>
Pull x86 seccomp changes from Ingo Molnar:
 "This tree includes x86 seccomp filter speedups and related preparatory
  work, which touches core seccomp facilities as well.

  The main idea is to split seccomp into two phases, to be able to enter
  a simple fast path for syscalls with ptrace side effects.

  There's no substantial user-visible (and ABI) effects expected from
  this, except a change in how we emit a better audit record for
  SECCOMP_RET_TRACE events"

* 'x86-seccomp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86_64, entry: Use split-phase syscall_trace_enter for 64-bit syscalls
  x86_64, entry: Treat regs-&gt;ax the same in fastpath and slowpath syscalls
  x86: Split syscall_trace_enter into two phases
  x86, entry: Only call user_exit if TIF_NOHZ
  x86, x32, audit: Fix x32's AUDIT_ARCH wrt audit
  seccomp: Document two-phase seccomp and arch-provided seccomp_data
  seccomp: Allow arch code to provide seccomp_data
  seccomp: Refactor the filter callback and the API
  seccomp,x86,arm,mips,s390: Remove nr parameter from secure_computing
</content>
</entry>
<entry>
<title>net: bpf: make eBPF interpreter images read-only</title>
<updated>2014-09-05T19:02:48+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-09-02T20:53:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=60a3b2253c413cf601783b070507d7dd6620c954'/>
<id>urn:sha1:60a3b2253c413cf601783b070507d7dd6620c954</id>
<content type='text'>
With eBPF getting more extended and exposure to user space is on it's way,
hardening the memory range the interpreter uses to steer its command flow
seems appropriate.  This patch moves the to be interpreted bytecode to
read-only pages.

In case we execute a corrupted BPF interpreter image for some reason e.g.
caused by an attacker which got past a verifier stage, it would not only
provide arbitrary read/write memory access but arbitrary function calls
as well. After setting up the BPF interpreter image, its contents do not
change until destruction time, thus we can setup the image on immutable
made pages in order to mitigate modifications to that code. The idea
is derived from commit 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks").

This is possible because bpf_prog is not part of sk_filter anymore.
After setup bpf_prog cannot be altered during its life-time. This prevents
any modifications to the entire bpf_prog structure (incl. function/JIT
image pointer).

Every eBPF program (including classic BPF that are migrated) have to call
bpf_prog_select_runtime() to select either interpreter or a JIT image
as a last setup step, and they all are being freed via bpf_prog_free(),
including non-JIT. Therefore, we can easily integrate this into the
eBPF life-time, plus since we directly allocate a bpf_prog, we have no
performance penalty.

Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
inspection of kernel_page_tables.  Brad Spengler proposed the same idea
via Twitter during development of this patch.

Joint work with Hannes Frederic Sowa.

Suggested-by: Brad Spengler &lt;spender@grsecurity.net&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Cc: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>seccomp: Allow arch code to provide seccomp_data</title>
<updated>2014-09-03T21:58:17+00:00</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@amacapital.net</email>
</author>
<published>2014-07-22T01:49:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=d39bd00deabe57420f2a3669eb71b0e0c4997184'/>
<id>urn:sha1:d39bd00deabe57420f2a3669eb71b0e0c4997184</id>
<content type='text'>
populate_seccomp_data is expensive: it works by inspecting
task_pt_regs and various other bits to piece together all the
information, and it's does so in multiple partially redundant steps.

Arch-specific code in the syscall entry path can do much better.

Admittedly this adds a bit of additional room for error, but the
speedup should be worth it.

Signed-off-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Refactor the filter callback and the API</title>
<updated>2014-09-03T21:58:17+00:00</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@amacapital.net</email>
</author>
<published>2014-07-22T01:49:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=13aa72f0fd0a9f98a41cefb662487269e2f1ad65'/>
<id>urn:sha1:13aa72f0fd0a9f98a41cefb662487269e2f1ad65</id>
<content type='text'>
The reason I did this is to add a seccomp API that will be usable
for an x86 fast path.  The x86 entry code needs to use a rather
expensive slow path for a syscall that might be visible to things
like ptrace.  By splitting seccomp into two phases, we can check
whether we need the slow path and then use the fast path in if the
filter allows the syscall or just returns some errno.

As a side effect, I think the new code is much easier to understand
than the old code.

This has one user-visible effect: the audit record written for
SECCOMP_RET_TRACE is now a simple indication that SECCOMP_RET_TRACE
happened.  It used to depend in a complicated way on what the tracer
did.  I couldn't make much sense of it.

Signed-off-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp,x86,arm,mips,s390: Remove nr parameter from secure_computing</title>
<updated>2014-09-03T21:58:17+00:00</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@amacapital.net</email>
</author>
<published>2014-07-22T01:49:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=a4412fc9486ec85686c6c7929e7e829f62ae377e'/>
<id>urn:sha1:a4412fc9486ec85686c6c7929e7e829f62ae377e</id>
<content type='text'>
The secure_computing function took a syscall number parameter, but
it only paid any attention to that parameter if seccomp mode 1 was
enabled.  Rather than coming up with a kludge to get the parameter
to work in mode 2, just remove the parameter.

To avoid churn in arches that don't have seccomp filters (and may
not even support syscall_get_nr right now), this leaves the
parameter in secure_computing_strict, which is now a real function.

For ARM, this is a bit ugly due to the fact that ARM conditionally
supports seccomp filters.  Fixing that would probably only be a
couple of lines of code, but it should be coordinated with the audit
maintainers.

This will be a slight slowdown on some arches.  The right fix is to
pass in all of seccomp_data instead of trying to make just the
syscall nr part be fast.

This is a prerequisite for making two-phase seccomp work cleanly.

Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: linux-mips@linux-mips.org
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: linux-s390@vger.kernel.org
Cc: x86@kernel.org
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Replace BUG(!spin_is_locked()) with assert_spin_lock</title>
<updated>2014-08-11T20:29:12+00:00</updated>
<author>
<name>Guenter Roeck</name>
<email>linux@roeck-us.net</email>
</author>
<published>2014-08-11T03:50:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=69f6a34bdeea4fec50bb90619bc9602973119572'/>
<id>urn:sha1:69f6a34bdeea4fec50bb90619bc9602973119572</id>
<content type='text'>
Current upstream kernel hangs with mips and powerpc targets in
uniprocessor mode if SECCOMP is configured.

Bisect points to commit dbd952127d11 ("seccomp: introduce writer locking").
Turns out that code such as
	BUG_ON(!spin_is_locked(&amp;list_lock));
can not be used in uniprocessor mode because spin_is_locked() always
returns false in this configuration, and that assert_spin_locked()
exists for that very purpose and must be used instead.

Fixes: dbd952127d11 ("seccomp: introduce writer locking")
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next</title>
<updated>2014-08-06T16:38:14+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-08-06T16:38:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/blackbird-op-linux/commit/?id=ae045e2455429c418a418a3376301a9e5753a0a8'/>
<id>urn:sha1:ae045e2455429c418a418a3376301a9e5753a0a8</id>
<content type='text'>
Pull networking updates from David Miller:
 "Highlights:

   1) Steady transitioning of the BPF instructure to a generic spot so
      all kernel subsystems can make use of it, from Alexei Starovoitov.

   2) SFC driver supports busy polling, from Alexandre Rames.

   3) Take advantage of hash table in UDP multicast delivery, from David
      Held.

   4) Lighten locking, in particular by getting rid of the LRU lists, in
      inet frag handling.  From Florian Westphal.

   5) Add support for various RFC6458 control messages in SCTP, from
      Geir Ola Vaagland.

   6) Allow to filter bridge forwarding database dumps by device, from
      Jamal Hadi Salim.

   7) virtio-net also now supports busy polling, from Jason Wang.

   8) Some low level optimization tweaks in pktgen from Jesper Dangaard
      Brouer.

   9) Add support for ipv6 address generation modes, so that userland
      can have some input into the process.  From Jiri Pirko.

  10) Consolidate common TCP connection request code in ipv4 and ipv6,
      from Octavian Purdila.

  11) New ARP packet logger in netfilter, from Pablo Neira Ayuso.

  12) Generic resizable RCU hash table, with intial users in netlink and
      nftables.  From Thomas Graf.

  13) Maintain a name assignment type so that userspace can see where a
      network device name came from (enumerated by kernel, assigned
      explicitly by userspace, etc.) From Tom Gundersen.

  14) Automatic flow label generation on transmit in ipv6, from Tom
      Herbert.

  15) New packet timestamping facilities from Willem de Bruijn, meant to
      assist in measuring latencies going into/out-of the packet
      scheduler, latency from TCP data transmission to ACK, etc"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits)
  cxgb4 : Disable recursive mailbox commands when enabling vi
  net: reduce USB network driver config options.
  tg3: Modify tg3_tso_bug() to handle multiple TX rings
  amd-xgbe: Perform phy connect/disconnect at dev open/stop
  amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask
  net: sun4i-emac: fix memory leak on bad packet
  sctp: fix possible seqlock seadlock in sctp_packet_transmit()
  Revert "net: phy: Set the driver when registering an MDIO bus device"
  cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine
  team: Simplify return path of team_newlink
  bridge: Update outdated comment on promiscuous mode
  net-timestamp: ACK timestamp for bytestreams
  net-timestamp: TCP timestamping
  net-timestamp: SCHED timestamp on entering packet scheduler
  net-timestamp: add key to disambiguate concurrent datagrams
  net-timestamp: move timestamp flags out of sk_flags
  net-timestamp: extend SCM_TIMESTAMPING ancillary data struct
  cxgb4i : Move stray CPL definitions to cxgb4 driver
  tcp: reduce spurious retransmits due to transient SACK reneging
  qlcnic: Initialize dcbnl_ops before register_netdev
  ...
</content>
</entry>
</feed>
