blackbird-op-linux/arch/s390/net, branch master

s390/bpf: Remove JITed image size limitations

2019-11-19T03:51:16+00:00

Now that jump and long displacement ranges are no longer a problem, remove the limit on JITed image size. In practice it's still limited by 2G, but with verifier allowing "only" 1M instructions, it's not an issue. Signed-off-by: Ilya Leoshkevich Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20191118180340.68373-7-iii@linux.ibm.com

s390/bpf: Use lg(f)rl when long displacement cannot be used

2019-11-19T03:51:16+00:00

If literal pool grows past 524287 mark, it's no longer possible to use long displacement to reference literal pool entries. In JIT setting maintaining multiple literal pool registers is next to impossible, since we operate on one instruction at a time. Therefore, fall back to loading literal pool entry using PC-relative addressing, and then using a register-register form of the following machine instruction. Signed-off-by: Ilya Leoshkevich Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20191118180340.68373-6-iii@linux.ibm.com

s390/bpf: Use lgrl instead of lg where possible

2019-11-19T03:51:16+00:00

lg and lgrl have the same performance characteristics, but the former requires a base register and is subject to long displacement range limits, while the latter does not. Therefore, lgrl is totally superior to lg and should be used instead whenever possible. Signed-off-by: Ilya Leoshkevich Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20191118180340.68373-5-iii@linux.ibm.com

s390/bpf: Load literal pool register using larl

2019-11-19T03:51:16+00:00

Currently literal pool register is loaded using basr, which makes it point not to the beginning of the literal pool, but rather to the next instruction. In case JITed code is larger than 512k, this renders literal pool register absolutely useless due to long displacement range restrictions. The solution is to use larl to make literal pool register point to the very beginning of the literal pool. This makes it always possible to address 512k worth of literal pool entries using long displacement. However, for short programs, in which the entire literal pool is covered by basr-generated base, it is still beneficial to use basr, since it is 4 bytes shorter than larl. Detect situations when basr-generated base does not cover the entire literal pool, and in such cases use larl instead. Signed-off-by: Ilya Leoshkevich Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20191118180340.68373-4-iii@linux.ibm.com

s390/bpf: Align literal pool entries

2019-11-19T03:51:16+00:00

When literal pool size exceeds 512k, it's no longer possible to reference all the entries in it using a single base register and long displacement. Therefore, PC-relative lgfrl and lgrl instructions need to be used. Unfortunately, they require their arguments to be aligned to 4- and 8-byte boundaries respectively. This generates certain overhead due to necessary padding bytes. Grouping 4- and 8-byte entries together reduces the maximum overhead to 6 bytes (2 for aligning 4-byte entries and 4 for aligning 8-byte entries). While in theory it is possible to detect whether or not alignment is needed by comparing the literal pool size with 512k, in practice this leads to having two ways of emitting constants, making the code more complicated. Prefer code simplicity over trivial size saving, and always group and align literal pool entries. Signed-off-by: Ilya Leoshkevich Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20191118180340.68373-3-iii@linux.ibm.com

s390/bpf: Use relative long branches

2019-11-19T03:51:16+00:00

Currently maximum JITed code size is limited to 64k, because JIT can emit only relative short branches, whose range is limited by 64k in both directions. Teach JIT to use relative long branches. There are no compare+branch relative long instructions, so using relative long branches consumes more space due to having to having to emit an explicit comparison instruction. Therefore do this only when relative short branch is not enough. Signed-off-by: Ilya Leoshkevich Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20191118180340.68373-2-iii@linux.ibm.com

s390/bpf: Make sure JIT passes do not increase code size

2019-11-15T21:25:54+00:00

The upcoming s390 branch length extension patches rely on "passes do not increase code size" property in order to consistently choose between short and long branches. Currently this property does not hold between the first and the second passes for register save/restore sequences, as well as various code fragments that depend on SEEN_* flags. Generate the code during the first pass conservatively: assume register save/restore sequences have the maximum possible length, and that all SEEN_* flags are set. Also refuse to JIT if this happens anyway (e.g. due to a bug), as this might lead to verifier bypass once long branches are introduced. Signed-off-by: Ilya Leoshkevich Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20191114151820.53222-1-iii@linux.ibm.com

s390/bpf: Remove unused SEEN_RET0, SEEN_REG_AX and ret0_ip

2019-11-07T15:10:01+00:00

We don't need them since commit e1cf4befa297 ("bpf, s390x: remove ld_abs/ld_ind") and commit a3212b8f15d8 ("bpf, s390x: remove obsolete exception handling from div/mod"). Also, use BIT(n) instead of 1 << n, because checkpatch says so. Signed-off-by: Ilya Leoshkevich Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20191107114033.90505-1-iii@linux.ibm.com

s390/bpf: Wrap JIT macro parameter usages in parentheses

2019-11-07T15:07:28+00:00

This change does not alter JIT behavior; it only makes it possible to safely invoke JIT macros with complex arguments in the future. Signed-off-by: Ilya Leoshkevich Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20191107113211.90105-1-iii@linux.ibm.com

s390/bpf: Use kvcalloc for addrs array

2019-11-07T15:05:55+00:00

A BPF program may consist of 1m instructions, which means JIT instruction-address mapping can be as large as 4m. s390 has FORCE_MAX_ZONEORDER=9 (for memory hotplug reasons), which means maximum kmalloc size is 1m. This makes it impossible to JIT programs with more than 256k instructions. Fix by using kvcalloc, which falls back to vmalloc for larger allocations. An alternative would be to use a radix tree, but that is not supported by bpf_prog_fill_jited_linfo. Signed-off-by: Ilya Leoshkevich Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20191107141838.92202-1-iii@linux.ibm.com