From: Luis Chamberlain <mcgrof@kernel.org>
To: p.raghav@samsung.com,
	hare@suse.de,
	kbusch@kernel.org,
	david@fromorbit.com,
	neilb@suse.de
Cc: mcgrof@kernel.org,
	gost.dev@samsung.com,
	linux-block@vger.kernel.org,
	linux-mm@kvack.org,
	patches@lists.linux.dev
Subject: [RFC] swapfile: disable swapon for bs > ps devices
Date: Wed, 26 Jun 2024 17:09:23 -0700
Message-ID: <20240627000924.2074949-1-mcgrof@kernel.org>
X-Mailing-List: linux-block@vger.kernel.org
List-Id: <linux-block.vger.kernel.org>
List-Subscribe: <mailto:linux-block+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-block+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: Luis Chamberlain <mcgrof@infradead.org>
Xref: photonic.trudheim.com org.kernel.vger.linux-block:93527 org.kvack.linux-mm:201557
Newsgroups: org.kernel.vger.linux-block,dev.linux.lists.patches,org.kvack.linux-mm
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

Devices which require bs > ps cannot be supported for swap, as swap
still needs work. Once the block device cache sets the min order for
block devices [0] we need this stop gap, otherwise every swap operation
is rejected.

[0] https://lore.kernel.org/all/20240510102906.51844-6-hare@kernel.org/T/#md09501306c649dd84db0a711f9359570c17a197f

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---

This is super *way* forward looking, for after the LBS patches land and once
we square away how to support things on the block device cache. Only then
does it make sense to start considering this. But this is just a stop gap.

If you think about it, though: in practice, since we are going forward with a
world where we have AWUPF >= NPWG to enable the physical_block_size to
be >= NPWG, the corner case we want to help users *try* to avoid is enabling
swap not when the LBA format is > PAGE_SIZE (although for sport we could
support that) but when the NPWG > PAGE_SIZE. So we'd warn about that until
swap gets a facelift. That is, 4k writes will still work for devices with a
4k LBA format but NPWG = 16k; they would just work with an RMW penalty, just
as RMWs happen today with drives formatted with a 512 LBA format in today's
default world of a 4k IU.
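
The alignment rule described above can be sketched as a quick model (purely
illustrative; the IU/NPWG values below are example numbers, not queried from
any real device):

```python
def needs_rmw(offset: int, length: int, iu: int) -> bool:
    """A write incurs read-modify-write when it is not aligned to,
    or not a multiple of, the drive's indirection unit (IU)."""
    return offset % iu != 0 or length % iu != 0

# 4k write on a drive with a 4k LBA format but a 16k NPWG/IU:
# the write succeeds, it just pays an RMW penalty inside the device.
assert needs_rmw(offset=0, length=4096, iu=16384)

# An IU-aligned, IU-sized write avoids the penalty.
assert not needs_rmw(offset=16384, length=16384, iu=16384)

# Same story for today's 512 LBA format drives with a 4k IU.
assert needs_rmw(offset=512, length=512, iu=4096)
```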

As it turns out, we have no topology information for the IU today. It used
to be that physical_block_size had language about RMW.
During the 2024 LSFMM thread about Large Block sizes for IO that Hannes
proposed, we reviewed this discrepancy [1], but we seemed to conclude then
that no changes were required.

I'm starting to think that exposing the IU might make sense now. The
check below would not capture the case of IU > PAGE_SIZE; in theory that
should work, it just incurs RMWs, but users should likely be informed
that doing so is unwise. The other, more important use case would be
STATX_DIOALIGN's dio_offset_align, which seems incorrect today even for
existing drives with a 4k IU and 512 LBA format.

Thoughts?

[1] https://lore.kernel.org/all/ZekfZdchUnRZoebo@bombadil.infradead.org/

 mm/swapfile.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2f5203aa2d2c..9ff168760bc2 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3153,6 +3153,11 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		goto bad_swap_unlock_inode;
 	}
 
+	if (mapping_min_folio_order(mapping) > 0) {
+		error = -EINVAL;
+		goto bad_swap_unlock_inode;
+	}
+
 	/*
 	 * Read the swap header.
 	 */
-- 
2.43.0

.

From: Li Lingfeng <lilingfeng@huaweicloud.com>
To: tj@kernel.org,
	josef@toxicpanda.com,
	hch@lst.de,
	axboe@kernel.dk
Cc: longman@redhat.com,
	ming.lei@redhat.com,
	cgroups@vger.kernel.org,
	linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	yangerkun@huawei.com,
	yukuai1@huaweicloud.com,
	houtao1@huawei.com,
	yi.zhang@huawei.com,
	lilingfeng@huaweicloud.com,
	lilingfeng3@huawei.com
Subject: [PATCH] blk-cgroup: don't clear stat in blkcg_reset_stats()
Date: Thu, 27 Jun 2024 17:08:56 +0800
Message-Id: <20240627090856.2345018-1-lilingfeng@huaweicloud.com>
X-Mailing-List: linux-block@vger.kernel.org
List-Id: <linux-block.vger.kernel.org>
List-Subscribe: <mailto:linux-block+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-block+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Xref: photonic.trudheim.com org.kernel.vger.linux-block:93540 org.kernel.vger.linux-kernel:1260624
Newsgroups: org.kernel.vger.linux-block,org.kernel.vger.cgroups,org.kernel.vger.linux-kernel
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

From: Li Lingfeng <lilingfeng3@huawei.com>

The list corruption described in commit 6da668063279 ("blk-cgroup: fix
list corruption from resetting io stat") has no effect, so it's
unnecessary to fix it.

As for cgroup v1, it does not use iostat any more after commit
ad7c3b41e86b ("blk-throttle: Fix io statistics for cgroup v1"), so using
memset to clear iostat has no real effect.
As for cgroup v2, it will not call blkcg_reset_stats(), so it cannot
corrupt the list.

The list of the root cgroup can be used by both cgroup v1 and v2, while
a non-root cgroup's list can't, since a non-root cgroup must be removed
before switching between cgroup v1 and v2.
So the reset could only have an effect if the root list used by cgroup v2
were corrupted after switching to cgroup v1, and cgroup v2 then reused
the corrupted list after switching back.
However, the root cgroup does not use the list any more after commit
ef45fe470e1e ("blk-cgroup: show global disk stats in root cgroup io.stat").

Since this has no negative effect, the clearing is not necessary. Remove
the related code.

Fixes: 6da668063279 ("blk-cgroup: fix list corruption from resetting io stat")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
 block/blk-cgroup.c | 24 ------------------------
 1 file changed, 24 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 37e6cc91d576..1113c398a742 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -629,29 +629,6 @@ static void blkg_iostat_set(struct blkg_iostat *dst, struct blkg_iostat *src)
 	}
 }
 
-static void __blkg_clear_stat(struct blkg_iostat_set *bis)
-{
-	struct blkg_iostat cur = {0};
-	unsigned long flags;
-
-	flags = u64_stats_update_begin_irqsave(&bis->sync);
-	blkg_iostat_set(&bis->cur, &cur);
-	blkg_iostat_set(&bis->last, &cur);
-	u64_stats_update_end_irqrestore(&bis->sync, flags);
-}
-
-static void blkg_clear_stat(struct blkcg_gq *blkg)
-{
-	int cpu;
-
-	for_each_possible_cpu(cpu) {
-		struct blkg_iostat_set *s = per_cpu_ptr(blkg->iostat_cpu, cpu);
-
-		__blkg_clear_stat(s);
-	}
-	__blkg_clear_stat(&blkg->iostat);
-}
-
 static int blkcg_reset_stats(struct cgroup_subsys_state *css,
 			     struct cftype *cftype, u64 val)
 {
@@ -668,7 +645,6 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css,
 	 * anyway.  If you get hit by a race, retry.
 	 */
 	hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
-		blkg_clear_stat(blkg);
 		for (i = 0; i < BLKCG_MAX_POLS; i++) {
 			struct blkcg_policy *pol = blkcg_policy[i];
 
-- 
2.31.1

.

From: Daniel Wagner <dwagner@suse.de>
To: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Hannes Reinecke <hare@suse.de>,
	linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org,
	Daniel Wagner <dwagner@suse.de>
Subject: [PATCH blktests v3 0/3] Add support to run against real target
Date: Thu, 27 Jun 2024 11:10:13 +0200
Message-ID: <20240627091016.12752-1-dwagner@suse.de>
X-Mailing-List: linux-block@vger.kernel.org
List-Id: <linux-block.vger.kernel.org>
List-Subscribe: <mailto:linux-block+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-block+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Xref: photonic.trudheim.com org.kernel.vger.linux-block:93541
Newsgroups: org.kernel.vger.linux-block
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

I've added a new hook so that the default variables can be configured via the
script. This simple overwrite of the defaults allows using externally
configured setups (there is some trickery involved, as it's not possible to
do it only once due to include orders). The upside of this approach is that
we don't have to add more environment variables.

I've run blktests against a PowerStore. That worked fairly okay, but there
were some fallouts, which is kind of expected at this stage:

# cat ~/.config/blktests/nvme_target_control.toml
[main]
skip_setup_cleanup=true
nvmetcli='/home/wagi/work/nvmetcli/nvmetcli'
remote='http://nvmet:5000'

[host]
blkdev_type='device'
trtype='tcp'
hostnqn='nqn.2014-08.org.nvmexpress:uuid:1a9e23dd-466e-45ca-9f43-a29aaf47cb21'
hostid='1a9e23dd-466e-45ca-9f43-a29aaf47cb21'
host_traddr='10.161.16.48'

[subsys_0]
traddr='10.162.198.45'
trsvid='4420'
subsysnqn='nqn.1988-11.com.dell:powerstore:00:f03028e73ef7D032D81E'
subsys_uuid='3a5c104c-ee41-38a1-8ccf-0968003d54e7'


# NVME_TARGET_CONTROL=/root/blktests/contrib/nvme_target_control.py ./check nvme

nvme/002 (tr=tcp) (create many subsystems and test discovery) [not run]
    nvme_trtype=tcp is not supported in this test
nvme/003 (tr=tcp) (test if we're sending keep-alives to a discovery controller)
nvme/003 (tr=tcp) (test if we're sending keep-alives to a discovery controller) [passed]
    runtime    ...  15.397s
nvme/004 (tr=tcp) (test nvme and nvmet UUID NS descriptors)  [failed]
    runtime    ...  42.584s
    --- tests/nvme/004.out      2024-06-27 09:45:35.496518067 +0200
    +++ /root/blktests/results/nodev_tr_tcp/nvme/004.out.bad    2024-06-27 10:38:59.424409636 +0200
    @@ -1,3 +1,4 @@
     Running nvme/004
    -disconnected 1 controller(s)
    +No namespaces found
    +disconnected 13 controller(s)
     Test complete
nvme/005 (tr=tcp) (reset local loopback target)              [passed]
    runtime    ...  11.160s
nvme/006 (tr=tcp bd=device) (create an NVMeOF target)        [passed]
    runtime    ...  1.350s
nvme/008 (tr=tcp bd=device) (create an NVMeOF host)          [failed]
    runtime    ...  8.748s
    --- tests/nvme/008.out      2024-06-27 09:45:35.496518067 +0200
    +++ /root/blktests/results/nodev_tr_tcp_bd_device/nvme/008.out.bad  2024-06-27 10:39:23.624408817 +0200
    @@ -1,3 +1,4 @@
     Running nvme/008
    +UUID 3a5c104c-ee41-38a1-8ccf-0968003d54e7 mismatch (wwid eui.3a5c104cee4138a18ccf0968003d54e7)
     disconnected 1 controller(s)
     Test complete
nvme/010 (tr=tcp bd=device) (run data verification fio job)  [passed]
    runtime    ...  29.798s
nvme/012 (tr=tcp bd=device) (run mkfs and data verification fio) [failed]
    runtime    ...  162.299s
    --- tests/nvme/012.out      2024-06-27 09:45:35.500518066 +0200
    +++ /root/blktests/results/nodev_tr_tcp_bd_device/nvme/012.out.bad  2024-06-27 10:42:38.008402238 +0200
    @@ -1,3 +1,6 @@
     Running nvme/012
    +fio: io_u error on file /mnt/blktests//verify.0.0: No space left on device: write offset=44917682176, buflen=4096
    +fio exited with status 1
    +4;fio-3.23;verify;0;28;0;0;0;0;0;0;0.000000;0.000000;0;0;0.000000;0.000000;1.000000%=0;5.000000%=0;10.000000%=0;20.000000%=0;30.000000%=0;40.000000%=0;50.000000%=0;60.000000%=0;70.000000%=0;80.000000%=0;90.000000%=0;95.000000%=0;99.000000%=0;99.500000%=0;99.900000%=0;99.950000%=0;99.990000%=0;0%=0;0%=0;0%=0;0;0;0.000000;0.000000;0;0;0.000000%;0.000000;0.000000;3508332;24662;6165;142251;5;19501;35.466984;172.426891;127;60177;2556.665370;1420.047447;1.000000%=1056;5.000000%=1548;10.000000%=1646;20.000000%=1777;30.000000%=1908;40.000000%=2072;50.000000%=2277;60.000000%=2441;70.000000%=2637;80.000000%=2932;90.000000%=3653;95.000000%=4554;99.000000%=8224;99.500000%=10027;99.900000%=14745;99.950000%=17956;99.990000%=39583;0%=0;0%=0;0%=0;418;60193;2592.427453;1433.177059;18632;44736;100.000000%;24704.690141;4469.952445;0;0;0;0;0;0;0.000000;0.000000;0;0;0.000000;0.000000;1.000000%=0;5.000000%=0;10.000000%=0;20.000000%=0;30.000000%=0;40.000000%=0;50.000000%=0;60.000000%=0;70.000000%=0;80.000000%=0;90.000000%=0;95.000000%=0;99.000000%=0;99.500000%=0;99.900000%=0;99.950000%=0;99.990000%=0;0%=0;0%=0;0%=0;0;0;0.000000;0.000000;0;0;0.000000%;0.000000;0.000000;5.627417%;9.681547%;711749;0;22200;0.1%;0.1%;0.1%;0.1%;100.0%;0.0%;0.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.01%;0.12%;0.29%;0.43%;35.65%;56.04%;6.95%;0.47%;0.03%;0.01%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;nvme0n1;0;1739486;0;0;0;4596624;4596624;100.00%
     disconnected 1 controller(s)
     Test complete
nvme/014 (tr=tcp bd=device) (flush a command from host)
^C^C^C

The flush test hung forever, but this could just be an outdated host kernel.

changes:
v3:
  - added support for a previously configured target
  - renamed nvme_nvme_target to _require_kernel_nvme_target
  - use shorter redirect operator
  - https://lore.kernel.org/all/20240612110444.4507-1-dwagner@suse.de/
v2:
  - many of the preparation patches have been merged, drop them
  - added a python script which implements the blktests API
  - added some documentation on how to use it
  - changed the casing of the environment variables to upper case

v1:
  - initial version
  - https://lore.kernel.org/linux-nvme/20240318093856.22307-1-dwagner@suse.de/

Daniel Wagner (3):
  nvme/rc: introduce remote target support
  nvme/030: only run against kernel soft target
  contrib: add remote target setup/cleanup script

 Documentation/running-tests.md |  33 ++++++
 check                          |   4 +
 contrib/nvme_target_control.py | 181 +++++++++++++++++++++++++++++++++
 contrib/nvmet-subsys.jinja2    |  71 +++++++++++++
 tests/nvme/030                 |   1 +
 tests/nvme/rc                  |  65 +++++++++++-
 6 files changed, 353 insertions(+), 2 deletions(-)
 create mode 100755 contrib/nvme_target_control.py
 create mode 100644 contrib/nvmet-subsys.jinja2

--=20
2.45.2

.

From: Christoph Hellwig <hch@lst.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Subject: de-duplicate the block sysfs code
Date: Thu, 27 Jun 2024 13:14:00 +0200
Message-ID: <20240627111407.476276-1-hch@lst.de>
X-Mailing-List: linux-block@vger.kernel.org
List-Id: <linux-block.vger.kernel.org>
List-Subscribe: <mailto:linux-block+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-block+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Xref: photonic.trudheim.com org.kernel.vger.linux-block:93551
Newsgroups: org.kernel.vger.linux-block
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

Hi Jens,

this series adds a few helpers to de-duplicate the block sysfs code,
and then switches it to operate on the gendisk, which is the object that
the kobject is embedded into.

Diffstat:
 block/blk-sysfs.c      |  389 ++++++++++++++++++-------------------------------
 block/elevator.c       |    9 -
 block/elevator.h       |    4 
 include/linux/blkdev.h |    7 
 4 files changed, 156 insertions(+), 253 deletions(-)
.

From: Christoph Hellwig <hch@lst.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>,
	"Md. Haris Iqbal" <haris.iqbal@ionos.com>,
	Jack Wang <jinpu.wang@ionos.com>,
	Kashyap Desai <kashyap.desai@broadcom.com>,
	Sumit Saxena <sumit.saxena@broadcom.com>,
	Shivasharan S <shivasharan.srikanteshwara@broadcom.com>,
	Chandrakanth patil <chandrakanth.patil@broadcom.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Sathya Prakash <sathya.prakash@broadcom.com>,
	Sreekanth Reddy <sreekanth.reddy@broadcom.com>,
	Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>,
	linux-block@vger.kernel.org,
	megaraidlinux.pdl@broadcom.com,
	linux-scsi@vger.kernel.org,
	MPT-FusionLinux.pdl@broadcom.com
Subject: get drivers out of setting queue flags
Date: Thu, 27 Jun 2024 14:49:10 +0200
Message-ID: <20240627124926.512662-1-hch@lst.de>
X-Mailing-List: linux-block@vger.kernel.org
List-Id: <linux-block.vger.kernel.org>
List-Subscribe: <mailto:linux-block+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-block+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Xref: photonic.trudheim.com org.kernel.vger.linux-block:93559
Newsgroups: org.kernel.vger.linux-block,org.kernel.vger.linux-scsi
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

Hi all,

now that driver features have been moved out of the queue flags,
the abuses where drivers set random internal queue flags stand out
even more.  This series fixes them up.

Diffstat:
 block/loop.c                      |   15 ++-------------
 block/rnbd/rnbd-clt.c             |    2 --
 scsi/megaraid/megaraid_sas_base.c |    2 --
 scsi/mpt3sas/mpt3sas_scsih.c      |    6 ------
 4 files changed, 2 insertions(+), 23 deletions(-)
.

From: Kundan Kumar <kundan.kumar@samsung.com>
To: axboe@kernel.dk, hch@lst.de, willy@infradead.org, kbusch@kernel.org
Cc: linux-block@vger.kernel.org, joshi.k@samsung.com, mcgrof@kernel.org,
	anuj20.g@samsung.com, nj.shetty@samsung.com, c.gameti@samsung.com,
	gost.dev@samsung.com, Kundan Kumar <kundan.kumar@samsung.com>
Subject: [PATCH v6 0/3] block: add larger order folio instead of pages
Date: Thu, 27 Jun 2024 16:15:49 +0530
Message-Id: <20240627104552.11177-1-kundan.kumar@samsung.com>
X-Mailing-List: linux-block@vger.kernel.org
List-Id: <linux-block.vger.kernel.org>
List-Subscribe: <mailto:linux-block+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-block+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"
References: <CGME20240627105359epcas5p47eb8839f9e77c85cb737be9434cb2570@epcas5p4.samsung.com>
Xref: photonic.trudheim.com org.kernel.vger.linux-block:93567
Newsgroups: org.kernel.vger.linux-block
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

User space memory is mapped into the kernel in the form of a pages array.
These pages are iterated over and added to the bio. In the process, pages
are also checked for contiguity and merged.

When mTHP is enabled, the pages generally belong to a larger order folio.
This patch series enables adding a large folio to a bio. It fetches the
folio for a page in the page array. The page might start at an offset in
the folio, which could be a multiple of PAGE_SIZE. Subsequent pages in the
page array might belong to the same folio. Using the length of the folio,
the folio_offset and the remaining size, determine the length within the
folio which can be added to the bio. Check whether pages are contiguous
and belong to the same folio; if so, skip further processing for those
contiguous pages.

This scheme reduces the overhead of iterating through pages.
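
The folio-coalescing idea can be sketched in a few lines (a toy model of the
per-page loop, not the kernel code; a 4k PAGE_SIZE is assumed and pages are
represented as (folio, index-in-folio) pairs):

```python
PAGE_SIZE = 4096

def coalesce(pages):
    """Group a pinned-page array into per-folio extents, skipping
    pages already covered by the extent just emitted."""
    extents = []
    i = 0
    while i < len(pages):
        folio_id, idx = pages[i]
        # Count how many following pages are contiguous in this folio.
        n = 1
        while i + n < len(pages) and pages[i + n] == (folio_id, idx + n):
            n += 1
        # One extent (folio, folio_offset, len) replaces n per-page adds.
        extents.append((folio_id, idx * PAGE_SIZE, n * PAGE_SIZE))
        i += n  # skip the pages already covered by the extent
    return extents

# Four pages of one 16k folio collapse into a single extent.
pages = [("A", 0), ("A", 1), ("A", 2), ("A", 3), ("B", 0)]
assert coalesce(pages) == [("A", 0, 16384), ("B", 0, 4096)]
```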

perf diff before and after this change(with mTHP enabled):

Perf diff for write I/O with 128K block size:
    1.26%     -0.27%  [kernel.kallsyms]  [k] bio_iov_iter_get_pages
    1.78%             [kernel.kallsyms]  [k] bvec_try_merge_page
Perf diff for read I/O with 128K block size:
    3.90%     -1.51%  [kernel.kallsyms]  [k] bio_iov_iter_get_pages
    5.12%             [kernel.kallsyms]  [k] bvec_try_merge_page

Patch 1: Adds folio-lized version of bio_add_hw_page()
Patch 2: Adds changes to add larger order folio to BIO
Patch 3: Unpin user pages belonging to folio at once

Changes since v5:
- Made offset and len as size_t in function bio_add_hw_folio()
- Avoid unpinning skipped pages at submission, rather unpin all pages at
  once on IO completion

Changes since v4:
- folio-lize bio_add_hw_page() to bio_add_hw_folio()
- make bio_add_hw_page() as a wrapper around bio_add_hw_folio()
- make new functions bio_release_folio() and unpin_user_folio()
- made a helper function to check for contiguous pages of folio
- changed &folio->page to folio_page(folio, 0)
- reworded comments

Changes since v3:
- Added change to see if pages are contiguous and belong to same folio.
  If not then avoid skipping of pages.(Suggested by Matthew Wilcox)

Changes since v2:
- Made separate patches
- Corrected code as per kernel coding style
- Removed size_folio variable

Changes since v1:
- Changed functions bio_iov_add_page() and bio_iov_add_zone_append_page()
  to accept a folio
- Removed branch and calculate folio_offset and len in same fashion for
  both 0 order and larger folios
- Added change in NVMe driver to use nvme_setup_prp_simple() by
  ignoring multiples of PAGE_SIZE in offset
- Added a change to unpin_user_pages which were added as folios. Also
  stopped the unpin of pages one by one from __bio_release_pages()
  (Suggested by Keith)

Kundan Kumar (3):
  block: Added folio-lized version of bio_add_hw_page()
  block: introduce folio awareness and add a bigger size from folio
  block: unpin user pages belonging to a folio at once

 block/bio.c        | 116 ++++++++++++++++++++++++++++++++++-----------
 block/blk.h        |  11 +++++
 include/linux/mm.h |   1 +
 mm/gup.c           |  13 +++++
 4 files changed, 114 insertions(+), 27 deletions(-)

-- 
2.25.1

.

Return-Path: <linux-nvme-bounces+linux-nvme=archiver.kernel.org@lists.infradead.org>
From: Daniel Wagner <dwagner@suse.de>
Subject: [PATCH v2 0/3] nvme-pci: honor isolcpus configuration
Date: Thu, 27 Jun 2024 16:10:50 +0200
Message-Id: <20240627-isolcpus-io-queues-v2-0-26a32e3c4f75@suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: Jens Axboe <axboe@kernel.dk>, Keith Busch <kbusch@kernel.org>, 
 Sagi Grimberg <sagi@grimberg.me>, Thomas Gleixner <tglx@linutronix.de>, 
 Christoph Hellwig <hch@lst.de>
Cc: Frederic Weisbecker <fweisbecker@suse.com>, 
 Mel Gorman <mgorman@suse.de>, Hannes Reinecke <hare@suse.de>, 
 Sridhar Balaraman <sbalaraman@parallelwireless.com>, 
 "brookxu.cn" <brookxu.cn@gmail.com>, Ming Lei <ming.lei@redhat.com>, 
 linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, 
 linux-nvme@lists.infradead.org, Daniel Wagner <dwagner@suse.de>
X-BeenThere: linux-nvme@lists.infradead.org
X-Mailman-Version: 2.1.34
List-Id: <linux-nvme.lists.infradead.org>
List-Unsubscribe: <http://lists.infradead.org/mailman/options/linux-nvme>,
 <mailto:linux-nvme-request@lists.infradead.org?subject=unsubscribe>
List-Archive: <http://lists.infradead.org/pipermail/linux-nvme/>
List-Post: <mailto:linux-nvme@lists.infradead.org>
List-Help: <mailto:linux-nvme-request@lists.infradead.org?subject=help>
List-Subscribe: <http://lists.infradead.org/mailman/listinfo/linux-nvme>,
 <mailto:linux-nvme-request@lists.infradead.org?subject=subscribe>
Sender: "Linux-nvme" <linux-nvme-bounces@lists.infradead.org>
Errors-To: linux-nvme-bounces+linux-nvme=archiver.kernel.org@lists.infradead.org
Xref: photonic.trudheim.com org.infradead.lists.linux-nvme:61561 org.kernel.vger.linux-block:93572 org.kernel.vger.linux-kernel:1260926
Newsgroups: org.infradead.lists.linux-nvme,org.kernel.vger.linux-block,org.kernel.vger.linux-kernel
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

I've dropped the io_queue type from the housekeeping code because of
11ea68f553e2 ("genirq, sched/isolation: Isolate from handling managed
interrupts"). This convinced me that the original goal of the
managed_irq argument was to move any noise away from the isolcpus. So
let's just use it, and if there are real users of the current behavior we
can still add it back then. I hope that Ming will chime in eventually.

The rest of the changes are pretty small, splitting one patch and adding
documentation.

Initial cover letter:

The nvme-pci driver is ignoring the isolcpus configuration. There were
several attempts to fix this in the past [1][2]. This is another attempt
but this time trying to address the feedback and solve it in the core
code.

The first patch introduces a new option for isolcpus, 'io_queue', but I'm
not really sure if this is needed and we could just use the managed_irq
option instead. I guess it depends on whether there is a use case which
depends on queues on the isolated CPUs.

The second patch introduces a new block layer helper which returns the
number of possible queues. I suspect it would also make sense to make
this helper a bit smarter and consider the number of queues the
hardware supports as well.

And the last patch updates the group_cpus_evenly() function so that it uses
only the housekeeping CPUs when they are defined.
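
A toy model of that last behavior (illustrative only, not the kernel's
group_cpus_evenly() implementation):

```python
def group_cpus_evenly(num_groups, cpus, isolated=frozenset()):
    """Spread only the housekeeping (non-isolated) CPUs across
    the requested number of groups, round-robin."""
    housekeeping = [c for c in cpus if c not in isolated]
    groups = [[] for _ in range(num_groups)]
    for i, cpu in enumerate(housekeeping):
        groups[i % num_groups].append(cpu)
    return groups

# No isolation: all 4 CPUs are spread over 2 groups.
assert group_cpus_evenly(2, range(4)) == [[0, 2], [1, 3]]

# isolcpus=2-7: only the housekeeping CPUs 0-1 get queues.
assert group_cpus_evenly(2, range(8), isolated={2, 3, 4, 5, 6, 7}) \
    == [[0], [1]]
```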

Note this series does not address the affinity setting of the admin
queue (queue 0). I'd like to address that after we agree on how to solve
this. Currently, the admin queue affinity can be controlled by the
irq_affinity command line option, so there is at least a workaround for
it.

Baseline:

available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 1536 MB
node 0 free: 1227 MB
node 1 cpus: 4 5 6 7
node 1 size: 1729 MB
node 1 free: 1422 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

options nvme write_queues=4 poll_queues=4

55: 0 41 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 0-edge nvme0q0 affinity: 0-3
63: 0 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 1-edge nvme0q1 affinity: 4-5
64: 0 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 2-edge nvme0q2 affinity: 6-7
65: 0 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 3-edge nvme0q3 affinity: 0-1
66: 0 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 4-edge nvme0q4 affinity: 2-3
67: 0 0 0 0 24 0 0 0 PCI-MSIX-0000:00:05.0 5-edge nvme0q5 affinity: 4
68: 0 0 0 0 0 1 0 0 PCI-MSIX-0000:00:05.0 6-edge nvme0q6 affinity: 5
69: 0 0 0 0 0 0 41 0 PCI-MSIX-0000:00:05.0 7-edge nvme0q7 affinity: 6
70: 0 0 0 0 0 0 0 3 PCI-MSIX-0000:00:05.0 8-edge nvme0q8 affinity: 7
71: 1 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 9-edge nvme0q9 affinity: 0
72: 0 18 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 10-edge nvme0q10 affinity: 1
73: 0 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 11-edge nvme0q11 affinity: 2
74: 0 0 0 3 0 0 0 0 PCI-MSIX-0000:00:05.0 12-edge nvme0q12 affinity: 3

queue mapping for /dev/nvme0n1
        hctx0: default 4 5
        hctx1: default 6 7
        hctx2: default 0 1
        hctx3: default 2 3
        hctx4: read 4
        hctx5: read 5
        hctx6: read 6
        hctx7: read 7
        hctx8: read 0
        hctx9: read 1
        hctx10: read 2
        hctx11: read 3
        hctx12: poll 4 5
        hctx13: poll 6 7
        hctx14: poll 0 1
        hctx15: poll 2 3

PCI name is 00:05.0: nvme0n1
        irq 55, cpu list 0-3, effective list 1
        irq 63, cpu list 4-5, effective list 5
        irq 64, cpu list 6-7, effective list 7
        irq 65, cpu list 0-1, effective list 1
        irq 66, cpu list 2-3, effective list 3
        irq 67, cpu list 4, effective list 4
        irq 68, cpu list 5, effective list 5
        irq 69, cpu list 6, effective list 6
        irq 70, cpu list 7, effective list 7
        irq 71, cpu list 0, effective list 0
        irq 72, cpu list 1, effective list 1
        irq 73, cpu list 2, effective list 2
        irq 74, cpu list 3, effective list 3

* patched:

48: 0 0 33 0 0 0 0 0 PCI-MSIX-0000:00:05.0 0-edge nvme0q0 affinity: 0-3
58: 0 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 1-edge nvme0q1 affinity: 4
59: 0 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 2-edge nvme0q2 affinity: 5
60: 0 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 3-edge nvme0q3 affinity: 0
61: 0 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 4-edge nvme0q4 affinity: 1
62: 0 0 0 0 45 0 0 0 PCI-MSIX-0000:00:05.0 5-edge nvme0q5 affinity: 4
63: 0 0 0 0 0 12 0 0 PCI-MSIX-0000:00:05.0 6-edge nvme0q6 affinity: 5
64: 2 0 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 7-edge nvme0q7 affinity: 0
65: 0 35 0 0 0 0 0 0 PCI-MSIX-0000:00:05.0 8-edge nvme0q8 affinity: 1

queue mapping for /dev/nvme0n1
        hctx0: default 2 3 4 6 7
        hctx1: default 5
        hctx2: default 0
        hctx3: default 1
        hctx4: read 4
        hctx5: read 5
        hctx6: read 0
        hctx7: read 1
        hctx8: poll 4
        hctx9: poll 5
        hctx10: poll 0
        hctx11: poll 1

PCI name is 00:05.0: nvme0n1
        irq 48, cpu list 0-3, effective list 2
        irq 58, cpu list 4, effective list 4
        irq 59, cpu list 5, effective list 5
        irq 60, cpu list 0, effective list 0
        irq 61, cpu list 1, effective list 1
        irq 62, cpu list 4, effective list 4
        irq 63, cpu list 5, effective list 5
        irq 64, cpu list 0, effective list 0
        irq 65, cpu list 1, effective list 1

[1] https://lore.kernel.org/lkml/20220423054331.GA17823@lst.de/T/#m9939195a465accbf83187caf346167c4242e798d
[2] https://lore.kernel.org/linux-nvme/87fruci5nj.ffs@tglx/

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
Changes in v2:
- updated documentation
- splitted blk/nvme-pci patch
- dropped HK_TYPE_IO_QUEUE, use HK_TYPE_MANAGED_IRQ
- Link to v1: https://lore.kernel.org/r/20240621-isolcpus-io-queues-v1-0-8b169bf41083@suse.de

---
Daniel Wagner (3):
      blk-mq: add blk_mq_num_possible_queues helper
      nvme-pci: limit queue count to housekeeping CPUs
      lib/group_cpus.c: honor housekeeping config when grouping CPUs

 block/blk-mq-cpumap.c   | 20 +++++++++++++
 drivers/nvme/host/pci.c |  5 ++--
 include/linux/blk-mq.h  |  1 +
 lib/group_cpus.c        | 75 +++++++++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 97 insertions(+), 4 deletions(-)
---
base-commit: 6ba59ff4227927d3a8530fc2973b80e94b54d58f
change-id: 20240620-isolcpus-io-queues-1a88eb47ff8b

Best regards,
-- 
Daniel Wagner <dwagner@suse.de>


.

From: Li Lingfeng <lilingfeng@huaweicloud.com>
To: tj@kernel.org,
	josef@toxicpanda.com,
	hch@lst.de,
	axboe@kernel.dk,
	mkoutny@suse.com
Cc: cgroups@vger.kernel.org,
	linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	yangerkun@huawei.com,
	yukuai1@huaweicloud.com,
	houtao1@huawei.com,
	yi.zhang@huawei.com,
	lilingfeng@huaweicloud.com,
	lilingfeng3@huawei.com
Subject: [PATCH v2] block: flush all throttled bios when deleting the cgroup
Date: Thu, 27 Jun 2024 22:26:06 +0800
Message-Id: <20240627142606.3709394-1-lilingfeng@huaweicloud.com>
X-Mailing-List: linux-block@vger.kernel.org
List-Id: <linux-block.vger.kernel.org>
List-Subscribe: <mailto:linux-block+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-block+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Xref: photonic.trudheim.com org.kernel.vger.linux-block:93575 org.kernel.vger.linux-kernel:1260960
Newsgroups: org.kernel.vger.linux-block,org.kernel.vger.cgroups,org.kernel.vger.linux-kernel
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

From: Li Lingfeng <lilingfeng3@huawei.com>

When a process migrates to another cgroup and the original cgroup is
deleted, the restrictions on its throttled bios cannot be removed. If the
restrictions are set too low, it will take a long time for these bios to
complete.

Follow the process of deleting a disk: remove the restrictions and issue
the bios when the cgroup is deleted.

This changes the behavior of throttled bios:
Before: the limit of the throttled bios couldn't be changed and the bios
would complete under that limit;
Now: the limit is canceled and the throttled bios are flushed
immediately.
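
The before/after behavior can be modeled with a tiny sketch (names loosely
follow blk-throttle; this is illustrative, not the patch itself):

```python
import collections

class ThrottleGroup:
    """Toy model of flushing throttled bios when a group goes offline."""

    def __init__(self):
        self.queued = collections.deque()
        self.canceling = False

    def submit(self, bio):
        if self.canceling:
            return [bio]          # limit lifted: dispatch immediately
        self.queued.append(bio)   # over the limit: park the bio
        return []

    def offline(self):
        """pd_offline: lift the limit and flush everything queued."""
        self.canceling = True
        flushed = list(self.queued)
        self.queued.clear()
        return flushed

tg = ThrottleGroup()
tg.submit("bio0")
tg.submit("bio1")
# Deleting the cgroup flushes the parked bios immediately...
assert tg.offline() == ["bio0", "bio1"]
# ...and later bios are no longer held back.
assert tg.submit("bio2") == ["bio2"]
```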

References:
https://lore.kernel.org/r/20220318130144.1066064-4-ming.lei@redhat.com

Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
  v1->v2:
    Use "flush" instead of "cancel";
    Add a description of the effect on throttled bios.
 block/blk-throttle.c | 68 ++++++++++++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 24 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index c1bf73f8c75d..a0e5b28951ca 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1534,6 +1534,42 @@ static void throtl_shutdown_wq(struct request_queue *q)
 	cancel_work_sync(&td->dispatch_work);
 }
 
+static void tg_cancel_bios(struct throtl_grp *tg)
+{
+	struct throtl_service_queue *sq = &tg->service_queue;
+
+	if (tg->flags & THROTL_TG_CANCELING)
+		return;
+	/*
+	 * Set the flag to make sure throtl_pending_timer_fn() won't
+	 * stop until all throttled bios are dispatched.
+	 */
+	tg->flags |= THROTL_TG_CANCELING;
+
+	/*
+	 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
+	 * will be inserted to service queue without THROTL_TG_PENDING
+	 * set in tg_update_disptime below. Then IO dispatched from
+	 * child in tg_dispatch_one_bio will trigger double insertion
+	 * and corrupt the tree.
+	 */
+	if (!(tg->flags & THROTL_TG_PENDING))
+		return;
+
+	/*
+	 * Update disptime after setting the above flag to make sure
+	 * throtl_select_dispatch() won't exit without dispatching.
+	 */
+	tg_update_disptime(tg);
+
+	throtl_schedule_pending_timer(sq, jiffies + 1);
+}
+
+static void throtl_pd_offline(struct blkg_policy_data *pd)
+{
+	tg_cancel_bios(pd_to_tg(pd));
+}
+
 struct blkcg_policy blkcg_policy_throtl = {
 	.dfl_cftypes		= throtl_files,
 	.legacy_cftypes		= throtl_legacy_files,
@@ -1541,6 +1577,7 @@ struct blkcg_policy blkcg_policy_throtl = {
 	.pd_alloc_fn		= throtl_pd_alloc,
 	.pd_init_fn		= throtl_pd_init,
 	.pd_online_fn		= throtl_pd_online,
+	.pd_offline_fn		= throtl_pd_offline,
 	.pd_free_fn		= throtl_pd_free,
 };
 
@@ -1561,32 +1598,15 @@ void blk_throtl_cancel_bios(struct gendisk *disk)
 	 */
 	rcu_read_lock();
 	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) {
-		struct throtl_grp *tg = blkg_to_tg(blkg);
-		struct throtl_service_queue *sq = &tg->service_queue;
-
-		/*
-		 * Set the flag to make sure throtl_pending_timer_fn() won't
-		 * stop until all throttled bios are dispatched.
-		 */
-		tg->flags |= THROTL_TG_CANCELING;
-
 		/*
-		 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
-		 * will be inserted to service queue without THROTL_TG_PENDING
-		 * set in tg_update_disptime below. Then IO dispatched from
-		 * child in tg_dispatch_one_bio will trigger double insertion
-		 * and corrupt the tree.
+		 * disk_release will call pd_offline_fn to cancel bios.
+		 * However, disk_release can't be called while someone
+		 * holds a device reference and has issued bios that are
+		 * still inflight after del_gendisk.
+		 * Cancel bios here to ensure none are inflight after
+		 * del_gendisk.
 		 */
-		if (!(tg->flags & THROTL_TG_PENDING))
-			continue;
-
-		/*
-		 * Update disptime after setting the above flag to make sure
-		 * throtl_select_dispatch() won't exit without dispatching.
-		 */
-		tg_update_disptime(tg);
-
-		throtl_schedule_pending_timer(sq, jiffies + 1);
+		tg_cancel_bios(blkg_to_tg(blkg));
 	}
 	rcu_read_unlock();
 	spin_unlock_irq(&q->queue_lock);
-- 
2.39.2

.

From: John Garry <john.g.garry@oracle.com>
To: axboe@kernel.dk
Cc: linux-block@vger.kernel.org, hch@lst.de,
        John Garry <john.g.garry@oracle.com>
Subject: [PATCH] block: Delete blk_queue_flag_test_and_set()
Date: Thu, 27 Jun 2024 16:07:35 +0000
Message-Id: <20240627160735.842189-1-john.g.garry@oracle.com>
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
X-Mailing-List: linux-block@vger.kernel.org
List-Id: <linux-block.vger.kernel.org>
List-Subscribe: <mailto:linux-block+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-block+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Xref: photonic.trudheim.com org.kernel.vger.linux-block:93584
Newsgroups: org.kernel.vger.linux-block
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

Since commit 70200574cc22 ("block: remove QUEUE_FLAG_DISCARD"),
blk_queue_flag_test_and_set() has not been used, so delete it.
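
For reference, the removed helper simply wrapped test_and_set_bit(): set the
bit and return its previous value. A user-space analogue with C11 atomics
(illustrative only, not kernel code) behaves the same way:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Set bit `flag` in *flags and return whether it was already set,
 * mirroring the test_and_set_bit() semantics the deleted helper wrapped. */
static bool flag_test_and_set(unsigned int flag, atomic_ulong *flags)
{
	unsigned long old = atomic_fetch_or(flags, 1UL << flag);

	return (old >> flag) & 1;
}
```

The first call on a clear bit returns false and sets it; every later call
returns true, which is exactly the "was it already set?" answer callers used
it for.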

Signed-off-by: John Garry <john.g.garry@oracle.com>

diff --git a/block/blk-core.c b/block/blk-core.c
index 6fc1a5a1980d..71b7622c523a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -94,20 +94,6 @@ void blk_queue_flag_clear(unsigned int flag, struct request_queue *q)
 }
 EXPORT_SYMBOL(blk_queue_flag_clear);
 
-/**
- * blk_queue_flag_test_and_set - atomically test and set a queue flag
- * @flag: flag to be set
- * @q: request queue
- *
- * Returns the previous value of @flag - 0 if the flag was not set and 1 if
- * the flag was already set.
- */
-bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q)
-{
-	return test_and_set_bit(flag, &q->queue_flags);
-}
-EXPORT_SYMBOL_GPL(blk_queue_flag_test_and_set);
-
 #define REQ_OP_NAME(name) [REQ_OP_##name] = #name
 static const char *const blk_op_name[] = {
 	REQ_OP_NAME(READ),
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a53e3434e1a2..53c41ef4222c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -609,7 +609,6 @@ struct request_queue {
 
 void blk_queue_flag_set(unsigned int flag, struct request_queue *q);
 void blk_queue_flag_clear(unsigned int flag, struct request_queue *q);
-bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 
 #define blk_queue_stopped(q)	test_bit(QUEUE_FLAG_STOPPED, &(q)->queue_flags)
 #define blk_queue_dying(q)	test_bit(QUEUE_FLAG_DYING, &(q)->queue_flags)
-- 
2.31.1

.

Date: Thu, 27 Jun 2024 21:49:38 +0100
From: Daniel Golle <daniel@makrotopia.org>
To: Rob Herring <robh@kernel.org>, Krzysztof Kozlowski <krzk+dt@kernel.org>,
	Conor Dooley <conor+dt@kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>, Jens Axboe <axboe@kernel.dk>,
	Hauke Mehrtens <hauke@hauke-m.de>, Felix Fietkau <nbd@nbd.name>,
	Srinivas Kandagatla <srinivas.kandagatla@linaro.org>,
	Daniel Golle <daniel@makrotopia.org>,
	Dave Chinner <dchinner@redhat.com>, Jan Kara <jack@suse.cz>,
	Christian Brauner <brauner@kernel.org>,
	Thomas =?iso-8859-1?Q?Wei=DFschuh?= <linux@weissschuh.net>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Li Lingfeng <lilingfeng3@huawei.com>,
	Christian Heusel <christian@heusel.eu>,
	Min Li <min15.li@samsung.com>, Avri Altman <avri.altman@wdc.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Hannes Reinecke <hare@suse.de>,
	Mikko Rapeli <mikko.rapeli@linaro.org>, Yeqi Fu <asuk4.q@gmail.com>,
	Victor Shih <victor.shih@genesyslogic.com.tw>,
	Christophe JAILLET <christophe.jaillet@wanadoo.fr>,
	Li Zhijian <lizhijian@fujitsu.com>,
	"Ricardo B. Marliere" <ricardo@marliere.net>,
	devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mmc@vger.kernel.org, linux-block@vger.kernel.org
Subject: [PATCH v4 0/4] block: preparations for NVMEM provider
Message-ID: <cover.1719520771.git.daniel@makrotopia.org>
X-Mailing-List: linux-block@vger.kernel.org
List-Id: <linux-block.vger.kernel.org>
List-Subscribe: <mailto:linux-block+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-block+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Xref: photonic.trudheim.com org.kernel.vger.linux-block:93592 org.kernel.vger.linux-kernel:1261427
Newsgroups: org.kernel.vger.linux-block,org.kernel.vger.linux-devicetree,org.kernel.vger.linux-kernel,org.kernel.vger.linux-mmc
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

On embedded devices using an eMMC it is common that one or more (hw/sw)
partitions on the eMMC are used to store MAC addresses and Wi-Fi
calibration EEPROM data.

Typically the NVMEM framework is used to have kernel drivers read and
use binary data from EEPROMs, efuses, flash memory (MTD), ...

Using references to NVMEM bits in Device Tree allows the kernel to
access and apply e.g. the Ethernet MAC address, which can be a requirement
for userland to come up (e.g. for nfsroot).

The goal of this series is to prepare the block subsystem to allow for
the implementation of an NVMEM provider similar to other types of
non-volatile storage, so the same approach already used for EEPROMs, MTD
(raw flashes) and UBI-managed NAND can also be used for devices storing
those bits on an eMMC.

Define a device tree schema for block devices and partitions on them,
which (similar to how it already works for UBI volumes) can be matched
by one or more properties.
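
As a rough idea of what such matching could look like, a hypothetical DT
fragment is shown below. Property and node names are illustrative only and
are not taken verbatim from the bindings added by this series:

```dts
/* Hypothetical fragment: an eMMC partition named "factory" exposed
 * as an NVMEM provider via a fixed layout. */
block-device {
	partitions {
		factory: partition-factory {
			partname = "factory";

			nvmem-layout {
				compatible = "fixed-layout";
				/* cells referencing MAC address, calibration data, ... */
			};
		};
	};
};
```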

Also add a simple notification API for other subsystems to be notified
about additions and removals of block devices, which is going to be used
by the block-backed NVMEM provider.

Overall, this enables uniform handling across practically all flash
storage types used for this purpose (MTD, UBI, soon also MMC, and in
future maybe other block devices).
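
The notification API itself is not spelled out in this cover letter; as a
rough user-space illustration of the register-and-notify pattern it
describes (all names here are hypothetical, not the actual interface the
series adds in block/blk-notify.c):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of an add/remove notifier chain for block
 * devices; names and shapes are illustrative only. */
enum blk_event { BLK_DEV_ADDED, BLK_DEV_REMOVED };

typedef void (*blk_notify_fn)(enum blk_event ev, const char *disk);

#define MAX_NOTIFIERS 8
static blk_notify_fn notifiers[MAX_NOTIFIERS];
static int n_notifiers;

/* A consumer (e.g. a block-backed NVMEM provider, possibly built as a
 * module) registers a callback... */
static int blk_register_notifier(blk_notify_fn fn)
{
	if (n_notifiers == MAX_NOTIFIERS)
		return -1;
	notifiers[n_notifiers++] = fn;
	return 0;
}

/* ...and the block layer walks the chain whenever a device appears
 * or goes away. */
static void blk_notify_all(enum blk_event ev, const char *disk)
{
	for (int i = 0; i < n_notifiers; i++)
		notifiers[i](ev, disk);
}

/* Example listener counting events, standing in for the NVMEM driver. */
static int adds, removes;
static void nvmem_listener(enum blk_event ev, const char *disk)
{
	(void)disk;
	if (ev == BLK_DEV_ADDED)
		adds++;
	else
		removes++;
}
```

Such a chain lets consumers live outside the block core, which is why the
series prefers it over class_interface and block-internal hooks.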
---
Changes since v3 sent on Jun 26th, addressing comments from Jens Axboe:
 - improve readability and error-handling in fwnode-matching code
 - remove forgotten code from earlier development accessing ddev->parent
 - use '#if defined' instead of '#ifdef' in header
 - provide inline dummies in case of CONFIG_BLOCK_NOTIFIERS not being set

Changes since v2 sent on May 30th 2024 [1], addressing comments from
Hauke Mehrtens (https://patchwork.kernel.org/comment/25892133/):
 - Check length of UUID and PARTNAME.
 - Remove forgotten fallback to get 'partitions' subnode from parent.
   It is no longer needed and was a left over from earlier development.
 - Split series into 3 parts, one for each affected subsystem. This is
   the first part covering only the changes needed in the block
   subsystem. The second part adds the actual nvmem provider to
   drivers/nvmem/, the third part is going to make use of it for MMC
   block devices and cover changes in drivers/mmc.

Changes since v1 sent on March 21st 2024 [2]:
 - introduce notifications for block device addition and removal for
   in-kernel users. This allows the nvmem driver to be built as a module
   and avoids using class_interface and block subsystem internals as
   suggested in https://patchwork.kernel.org/comment/25771998/ and
   https://patchwork.kernel.org/comment/25770441/

This series has previously been submitted as RFC on July 19th 2023[3]
and most of the basic idea did not change since. Another round of RFC
was submitted on March 5th 2024[4].

[1]: https://patchwork.kernel.org/project/linux-block/list/?series=857192
[2]: https://patchwork.kernel.org/project/linux-block/list/?series=837150&archive=both
[3]: https://patchwork.kernel.org/project/linux-block/list/?series=767565
[4]: https://patchwork.kernel.org/project/linux-block/list/?series=832705


Daniel Golle (4):
  dt-bindings: block: add basic bindings for block devices
  block: partitions: populate fwnode
  block: add support for notifications
  block: add new genhd flag GENHD_FL_NVMEM

 .../bindings/block/block-device.yaml          | 22 +++++
 .../devicetree/bindings/block/partition.yaml  | 51 +++++++++++
 .../devicetree/bindings/block/partitions.yaml | 20 +++++
 block/Kconfig                                 |  6 ++
 block/Makefile                                |  1 +
 block/blk-notify.c                            | 87 +++++++++++++++++++
 block/partitions/core.c                       | 70 +++++++++++++++
 include/linux/blkdev.h                        | 13 +++
 8 files changed, 270 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/block/block-device.yaml
 create mode 100644 Documentation/devicetree/bindings/block/partition.yaml
 create mode 100644 Documentation/devicetree/bindings/block/partitions.yaml
 create mode 100644 block/blk-notify.c

-- 
2.45.2
.

