Subject: [GIT PULL] io_uring updates for 6.12-rc1
From: Jens Axboe
To: Linus Torvalds
Cc: io-uring
Date: Fri, 13 Sep 2024 11:02:06 -0600

Hi Linus,

Here are the main io_uring changes for the 6.12 merge window. There will
be a followup one that adds discard support, but since that depends on
both this branch and the block branch, it'll be sent post both of those.

This pull request contains:

- NAPI fixes and cleanups (Pavel, Olivier)

- Add support for absolute timeouts (Pavel)

- Fixes for io-wq/sqpoll affinities (Felix)

- Efficiency improvements for dealing with huge pages (Chenliang)

- Support for a minwait mode, where the application essentially has two
  timeouts - one smaller one that defines the batch timeout, and the
  overall large one similar to what we had before. This enables
  efficient use of batching based on count + timeout, while still
  working well with periods of less intensive workloads.

- Use ITER_UBUF for single segment sends

- Add support for incremental buffer consumption. Right now each
  operation will always consume a full buffer. With incremental
  consumption, a recv/read operation only consumes the part of the
  buffer that it needs to satisfy the operation.

- Add support for GCOV for io_uring, to help retain a high coverage of
  test to code ratio.

- Fix regression with ocfs2, where an odd -EOPNOTSUPP wasn't correctly
  converted to a blocking retry.

- Add support for cloning registered buffers from one ring to another.

- Misc cleanups (Anuj, me)

Please pull!

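As an illustration of the incremental buffer consumption item in the list above, here is a toy model of the accounting difference between full and incremental consumption. It is only a sketch of the idea as described in this mail, not kernel code: `ProvidedBuffer`, `consume` and the 4096-byte size are hypothetical names and values made up for the example.

```python
from dataclasses import dataclass


@dataclass
class ProvidedBuffer:
    """Toy stand-in for one provided buffer (hypothetical, not a kernel struct)."""
    size: int
    offset: int = 0  # bytes already consumed


def consume(buf: ProvidedBuffer, nbytes: int, incremental: bool) -> bool:
    """Account for a recv/read that used nbytes of buf.

    Returns True when the buffer is fully consumed and the next operation
    needs a new buffer, False when the remainder can still be used.
    """
    if nbytes > buf.size - buf.offset:
        raise ValueError("operation cannot use more than the space left")
    if not incremental:
        # Full consumption: the whole buffer is used up, however small the transfer.
        buf.offset = buf.size
        return True
    # Incremental consumption: only the part the operation needed is consumed.
    buf.offset += nbytes
    return buf.offset == buf.size


if __name__ == "__main__":
    full = ProvidedBuffer(size=4096)
    print(consume(full, 100, incremental=False), full.offset)

    inc = ProvidedBuffer(size=4096)
    print(consume(inc, 100, incremental=True), inc.offset)
    print(consume(inc, 3996, incremental=True), inc.offset)
```

In this model a 100-byte receive retires the whole 4096-byte buffer in full mode, while in incremental mode the same buffer keeps serving later operations until its last byte has been used.
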
The following changes since commit 5be63fc19fcaa4c236b307420483578a56986a37:

  Linux 6.11-rc5 (2024-08-25 19:07:11 +1200)

are available in the Git repository at:

  git://git.kernel.dk/linux.git tags/for-6.12/io_uring-20240913

for you to fetch changes up to 7cc2a6eadcd7a5aa36ac63e6659f5c6138c7f4d2:

  io_uring: add IORING_REGISTER_COPY_BUFFERS method (2024-09-12 10:14:15 -0600)

----------------------------------------------------------------
for-6.12/io_uring-20240913

----------------------------------------------------------------
Anuj Gupta (2):
      io_uring: add new line after variable declaration
      io_uring: remove unused rsrc_put_fn

Chenliang Li (2):
      io_uring/rsrc: store folio shift and mask into imu
      io_uring/rsrc: enable multi-hugepage buffer coalescing

Felix Moessbauer (3):
      io_uring/sqpoll: do not allow pinning outside of cpuset
      io_uring/io-wq: do not allow pinning outside of cpuset
      io_uring/io-wq: inherit cpuset of cgroup in io worker

Jens Axboe (22):
      io_uring/kbuf: use 'bl' directly rather than req->buf_list
      io_uring/net: use ITER_UBUF for single segment send maps
      io_uring/kbuf: turn io_buffer_list booleans into flags
      io_uring: encapsulate extraneous wait flags into a separate struct
      io_uring: move schedule wait logic into helper
      io_uring: implement our own schedule timeout handling
      io_uring: add support for batch wait timeout
      io_uring: wire up min batch wake timeout
      io_uring/kbuf: shrink nr_iovs/mode in struct buf_sel_arg
      io_uring/kbuf: add io_kbuf_commit() helper
      io_uring/kbuf: move io_ring_head_to_buf() to kbuf.h
      Revert "io_uring: Require zeroed sqe->len on provided-buffers send"
      io_uring/kbuf: pass in 'len' argument for buffer commit
      io_uring/kbuf: add support for incremental buffer consumption
      io_uring: add GCOV_PROFILE_URING Kconfig option
      io_uring/eventfd: move refs to refcount_t
      io_uring/rw: treat -EOPNOTSUPP for IOCB_NOWAIT like -EAGAIN
      io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()
      io_uring/rsrc: clear 'slot' entry upfront
      io_uring/rsrc: add reference count to struct io_mapped_ubuf
      io_uring/register: provide helper to get io_ring_ctx from 'fd'
      io_uring: add IORING_REGISTER_COPY_BUFFERS method

Olivier Langlois (2):
      io_uring: add napi busy settings to the fdinfo output
      io_uring: micro optimization of __io_sq_thread() condition

Pavel Begunkov (4):
      io_uring/napi: refactor __io_napi_busy_loop()
      io_uring/napi: postpone napi timeout adjustment
      io_uring: add absolute mode wait timeouts
      io_uring: user registered clockid for wait timeouts

 include/linux/io_uring_types.h |   3 +
 include/uapi/linux/io_uring.h  |  42 ++++++-
 init/Kconfig                   |  13 +++
 io_uring/Makefile              |   4 +
 io_uring/eventfd.c             |  13 ++-
 io_uring/fdinfo.c              |  14 ++-
 io_uring/io-wq.c               |  25 ++++-
 io_uring/io_uring.c            | 212 ++++++++++++++++++++++++++---------
 io_uring/io_uring.h            |  12 ++
 io_uring/kbuf.c                |  96 ++++++++--------
 io_uring/kbuf.h                |  94 +++++++++++-----
 io_uring/napi.c                |  35 ++----
 io_uring/napi.h                |  16 ---
 io_uring/net.c                 |  27 +++--
 io_uring/register.c            |  91 +++++++++++----
 io_uring/register.h            |   1 +
 io_uring/rsrc.c                | 245 ++++++++++++++++++++++++++++++++++-------
 io_uring/rsrc.h                |  14 ++-
 io_uring/rw.c                  |  19 +++-
 io_uring/sqpoll.c              |   7 +-
 20 files changed, 723 insertions(+), 260 deletions(-)

--
Jens Axboe


Subject: [GIT PULL] io_uring async discard support
From: Jens Axboe
To: Linus Torvalds
Cc: io-uring, "linux-block@vger.kernel.org"
Date: Fri, 13 Sep 2024 11:02:24 -0600

Hi Linus,

As mentioned, and sitting on top of both the 6.12 block and io_uring
core branches, here's support for async discard through io_uring. This
allows applications to issue async discards, rather than rely on the
blocking sync ioctl discards we already have. The sync support is
difficult to use outside of idle/cleanup periods.

On a real (but slow) device, testing shows the following results when
compared to sync discard:

qd64 sync discard: 21K IOPS, lat avg 3 msec (max 21 msec)
qd64 async discard: 76K IOPS, lat avg 845 usec (max 2.2 msec)

qd64 sync discard: 14K IOPS, lat avg 5 msec (max 25 msec)
qd64 async discard: 56K IOPS, lat avg 1153 usec (max 3.6 msec)

and synthetic null_blk testing with the same queue depth and block size
settings as above shows:

Type    Trim size    IOPS    Lat avg (usec)    Lat Max (usec)
==============================================================
sync    4k           144K    444               20314
async   4k           1353K   47                595
sync    1M           56K     1136              21031
async   1M           94K     680               760

Please pull!

The following changes since commit 84eacf177faa605853c58e5b1c0d9544b88c16fd:

  io_uring/io-wq: inherit cpuset of cgroup in io worker (2024-09-11 07:27:56 -0600)

are available in the Git repository at:

  git://git.kernel.dk/linux.git tags/for-6.12/io_uring-discard-20240913

for you to fetch changes up to 50c52250e2d74b098465841163c18f4b4e9ad430:

  block: implement async io_uring discard cmd (2024-09-11 10:45:28 -0600)

----------------------------------------------------------------
for-6.12/io_uring-discard-20240913

----------------------------------------------------------------
Jens Axboe (2):
      Merge branch 'for-6.12/block' into for-6.12/io_uring-discard
      Merge branch 'for-6.12/io_uring' into for-6.12/io_uring-discard

Pavel Begunkov (5):
      io_uring/cmd: expose iowq to cmds
      io_uring/cmd: give inline space in request to cmds
      filemap: introduce filemap_invalidate_pages
      block: introduce blk_validate_byte_range()
      block: implement async io_uring discard cmd

 block/blk.h                  |   1 +
 block/fops.c                 |   2 +
 block/ioctl.c                | 163 +++++++++++++++++++++++++++++++----
 include/linux/io_uring/cmd.h |  15 ++++
 include/linux/pagemap.h      |   2 +
 include/uapi/linux/blkdev.h  |  14 +++
 io_uring/io_uring.c          |  11 +++
 io_uring/io_uring.h          |   1 +
 io_uring/uring_cmd.c         |   7 ++
 mm/filemap.c                 |  17 ++--
 10 files changed, 209 insertions(+), 24 deletions(-)

--
Jens Axboe
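
For readers who want the relative gains implied by the null_blk table in the discard pull request, the arithmetic can be sketched as below. The figures are copied from the table as reported; the dictionary layout and the `async_vs_sync` helper are names invented for this example.

```python
# (type, trim size) -> (IOPS in thousands, average latency in usec),
# copied from the null_blk table in the discard pull request.
NULL_BLK = {
    ("sync", "4k"): (144, 444),
    ("async", "4k"): (1353, 47),
    ("sync", "1M"): (56, 1136),
    ("async", "1M"): (94, 680),
}


def async_vs_sync(trim_size):
    """Return (IOPS ratio, average-latency ratio) of async over sync discard."""
    sync_iops, sync_lat = NULL_BLK[("sync", trim_size)]
    async_iops, async_lat = NULL_BLK[("async", trim_size)]
    return async_iops / sync_iops, sync_lat / async_lat


for size in ("4k", "1M"):
    iops_gain, lat_gain = async_vs_sync(size)
    print(f"{size}: {iops_gain:.1f}x IOPS, {lat_gain:.1f}x lower average latency")
```

With the reported numbers this works out to roughly 9.4x for both metrics at 4k trims and roughly 1.7x for both at 1M trims.
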