From: Mike Snitzer <snitzer@kernel.org>
To: linux-nfs@vger.kernel.org
Cc: Jeff Layton <jlayton@kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>,
	Anna Schumaker <anna@kernel.org>,
	Trond Myklebust <trondmy@hammerspace.com>,
	NeilBrown <neilb@suse.de>,
	snitzer@hammerspace.com
Subject: [PATCH v10 00/19] nfs/nfsd: add support for localio
Date: Sun, 30 Jun 2024 12:37:22 -0400
Message-ID: <20240630163741.48753-1-snitzer@kernel.org>
X-Mailing-List: linux-nfs@vger.kernel.org
List-Id: <linux-nfs.vger.kernel.org>
List-Subscribe: <mailto:linux-nfs+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-nfs+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Xref: photonic.trudheim.com org.kernel.vger.linux-nfs:86399
Newsgroups: org.kernel.vger.linux-nfs
Path: photonic.trudheim.com!nntp.lore.kernel.org!not-for-mail

Hi,

Changes since v9:
- Inverted series so that NFSD changes come before NFS.
- Addressed many of Chuck's review comments on "[PATCH v9 13/19]
  nfsd: add "localio" support"

TODO:
- Hopefully get a favorable response to this patch from XFS engineers:
  https://marc.info/?l=linux-xfs&m=171976530416523&w=2
  (otherwise, will need to revisit using dedicated workqueue patch)

All review and comments are welcome!

Thanks,
Mike

My git tree is here:
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/

This v10 is available as both branch nfs-localio-for-6.11 (which always
tracks the latest version) and branch nfs-localio-for-6.11.v10

Mike Snitzer (10):
  nfs: factor out {encode,decode}_opaque_fixed to nfs_xdr.h
  nfs_common: add NFS LOCALIO auxiliary protocol enablement
  nfsd: add Kconfig options to allow localio to be enabled
  nfsd: manage netns reference in nfsd_open_local_fh
  nfsd: use percpu_ref to interlock nfsd_destroy_serv and nfsd_open_local_fh
  nfsd: implement server support for NFS_LOCALIO_PROGRAM
  nfs: fix nfs_localio_vfs_getattr() to properly support v4
  SUNRPC: remove call_allocate() BUG_ON if p_arglen=0 to allow RPC with void arg
  nfs: implement client support for NFS_LOCALIO_PROGRAM
  nfs: add Documentation/filesystems/nfs/localio.rst

NeilBrown (1):
  SUNRPC: replace program list with program array

Trond Myklebust (2):
  nfs: enable localio for non-pNFS I/O
  pnfs/flexfiles: enable localio for flexfiles I/O

Weston Andros Adamson (6):
  SUNRPC: add rpcauth_map_to_svc_cred_local
  nfsd: add "localio" support
  nfs: pass nfs_client to nfs_initiate_pgio
  nfs: pass descriptor thru nfs_initiate_pgio path
  nfs: pass struct file to nfs_init_pgio and nfs_init_commit
  nfs: add "localio" support

 Documentation/filesystems/nfs/localio.rst | 135 ++++
 fs/Kconfig                                |   3 +
 fs/nfs/Kconfig                            |  14 +
 fs/nfs/Makefile                           |   1 +
 fs/nfs/blocklayout/blocklayout.c          |   6 +-
 fs/nfs/client.c                           |  15 +-
 fs/nfs/filelayout/filelayout.c            |  16 +-
 fs/nfs/flexfilelayout/flexfilelayout.c    | 131 +++-
 fs/nfs/flexfilelayout/flexfilelayout.h    |   2 +
 fs/nfs/flexfilelayout/flexfilelayoutdev.c |   6 +
 fs/nfs/inode.c                            |   4 +
 fs/nfs/internal.h                         |  60 +-
 fs/nfs/localio.c                          | 827 ++++++++++++++++++++++
 fs/nfs/nfs4xdr.c                          |  13 -
 fs/nfs/nfstrace.h                         |  61 ++
 fs/nfs/pagelist.c                         |  32 +-
 fs/nfs/pnfs.c                             |  24 +-
 fs/nfs/pnfs.h                             |   6 +-
 fs/nfs/pnfs_nfs.c                         |   2 +-
 fs/nfs/write.c                            |  13 +-
 fs/nfs_common/Makefile                    |   3 +
 fs/nfs_common/nfslocalio.c                |  74 ++
 fs/nfsd/Kconfig                           |  14 +
 fs/nfsd/Makefile                          |   1 +
 fs/nfsd/filecache.c                       |   2 +-
 fs/nfsd/localio.c                         | 329 +++++++++
 fs/nfsd/netns.h                           |  12 +-
 fs/nfsd/nfsctl.c                          |   2 +-
 fs/nfsd/nfsd.h                            |   2 +-
 fs/nfsd/nfssvc.c                          | 116 ++-
 fs/nfsd/trace.h                           |   3 +-
 fs/nfsd/vfs.h                             |   9 +
 include/linux/nfs.h                       |   9 +
 include/linux/nfs_fs.h                    |   2 +
 include/linux/nfs_fs_sb.h                 |  10 +
 include/linux/nfs_xdr.h                   |  20 +-
 include/linux/nfslocalio.h                |  42 ++
 include/linux/sunrpc/auth.h               |   4 +
 include/linux/sunrpc/svc.h                |   7 +-
 net/sunrpc/auth.c                         |  15 +
 net/sunrpc/clnt.c                         |   1 -
 net/sunrpc/svc.c                          |  68 +-
 net/sunrpc/svc_xprt.c                     |   2 +-
 net/sunrpc/svcauth_unix.c                 |   3 +-
 44 files changed, 1986 insertions(+), 135 deletions(-)
 create mode 100644 Documentation/filesystems/nfs/localio.rst
 create mode 100644 fs/nfs/localio.c
 create mode 100644 fs/nfs_common/nfslocalio.c
 create mode 100644 fs/nfsd/localio.c
 create mode 100644 include/linux/nfslocalio.h

-- 
2.44.0


Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
From: "NeilBrown" <neilb@suse.de>
To: "Jeff Layton" <jlayton@kernel.org>
Cc: "Chuck Lever" <chuck.lever@oracle.com>,
 "Mike Snitzer" <snitzer@kernel.org>, linux-nfs@vger.kernel.org,
 "Anna Schumaker" <anna@kernel.org>,
 "Trond Myklebust" <trondmy@hammerspace.com>, snitzer@hammerspace.com
Subject: Re: [PATCH v9 13/19] nfsd: add "localio" support
In-reply-to: <de0cd43fe008c32bfe6e3c983256862fb5ffb9c6.camel@kernel.org>
References: <>, <de0cd43fe008c32bfe6e3c983256862fb5ffb9c6.camel@kernel.org>
Date: Mon, 01 Jul 2024 07:54:16 +1000
Message-id: <171978445670.16071.1689758767313847463@noble.neil.brown.name>
Xref: photonic.trudheim.com org.kernel.vger.linux-nfs:86425

On Mon, 01 Jul 2024, Jeff Layton wrote:
> On Sun, 2024-06-30 at 15:55 -0400, Chuck Lever wrote:
> > On Sun, Jun 30, 2024 at 03:52:51PM -0400, Jeff Layton wrote:
> > > On Sun, 2024-06-30 at 15:44 -0400, Mike Snitzer wrote:
> > > > On Sun, Jun 30, 2024 at 10:49:51AM -0400, Chuck Lever wrote:
> > > > > On Sat, Jun 29, 2024 at 06:18:42PM -0400, Chuck Lever wrote:
> > > > > > > +
> > > > > > > +	/* nfs_fh -> svc_fh */
> > > > > > > +	if (nfs_fh->size > NFS4_FHSIZE) {
> > > > > > > +		status = -EINVAL;
> > > > > > > +		goto out;
> > > > > > > +	}
> > > > > > > +	fh_init(&fh, NFS4_FHSIZE);
> > > > > > > +	fh.fh_handle.fh_size = nfs_fh->size;
> > > > > > > +	memcpy(fh.fh_handle.fh_raw, nfs_fh->data, nfs_fh->size);
> > > > > > > +
> > > > > > > +	if (fmode & FMODE_READ)
> > > > > > > +		mayflags |= NFSD_MAY_READ;
> > > > > > > +	if (fmode & FMODE_WRITE)
> > > > > > > +		mayflags |= NFSD_MAY_WRITE;
> > > > > > > +
> > > > > > > +	beres = nfsd_file_acquire(rqstp, &fh, mayflags, &nf);
> > > > > > > +	if (beres) {
> > > > > > > +		status = nfs_stat_to_errno(be32_to_cpu(beres));
> > > > > > > +		goto out_fh_put;
> > > > > > > +	}
> > > > > >
> > > > > > So I'm wondering whether just calling fh_verify() and then
> > > > > > nfsd_open_verified() would be simpler and/or good enough here. Is
> > > > > > there a strong reason to use the file cache for locally opened
> > > > > > files? Jeff, any thoughts?
> > > > >
> > > > > > Will there be writeback ramifications for
> > > > > > doing this? Maybe we need a comment here explaining how these files
> > > > > > are garbage collected (just an fput by the local I/O client, I would
> > > > > > guess).
> > > > >
> > > > > OK, going back to this: Since right here we immediately call
> > > > >
> > > > > 	nfsd_file_put(nf);
> > > > >
> > > > > There are no writeback ramifications nor any need to comment about
> > > > > garbage collection. But this still seems like a lot of (possibly
> > > > > unnecessary) overhead for simply obtaining a struct file.
> > > >
> > > > Easy enough change, probably best to avoid the filecache but would like
> > > > to verify with Jeff before switching:
> > > >
> > > > diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
> > > > index 1d6508aa931e..85ebf63789fb 100644
> > > > --- a/fs/nfsd/localio.c
> > > > +++ b/fs/nfsd/localio.c
> > > > @@ -197,7 +197,6 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net,
> > > >         const struct cred *save_cred;
> > > >         struct svc_rqst *rqstp;
> > > >         struct svc_fh fh;
> > > > -       struct nfsd_file *nf;
> > > >         __be32 beres;
> > > >
> > > >         if (nfs_fh->size > NFS4_FHSIZE)
> > > > @@ -235,13 +234,12 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net,
> > > >         if (fmode & FMODE_WRITE)
> > > >                 mayflags |= NFSD_MAY_WRITE;
> > > >
> > > > -       beres = nfsd_file_acquire(rqstp, &fh, mayflags, &nf);
> > > > +       beres = fh_verify(rqstp, &fh, S_IFREG, mayflags);
> > > >         if (beres) {
> > > >                 status = nfs_stat_to_errno(be32_to_cpu(beres));
> > > >                 goto out_fh_put;
> > > >         }
> > > > -       *pfilp = get_file(nf->nf_file);
> > > > -       nfsd_file_put(nf);
> > > > +       status = nfsd_open_verified(rqstp, &fh, mayflags, pfilp);
> > > >  out_fh_put:
> > > >         fh_put(&fh);
> > > >         nfsd_local_fakerqst_destroy(rqstp);
> > > >
> > >
> > > My suggestion would be to _not_ do this. I think you do want to use the
> > > filecache (mostly for performance reasons).
> >
> > But look carefully:
> >
> >  -- we're not calling nfsd_file_acquire_gc() here
> >
> >  -- we're immediately calling nfsd_file_put() on the returned nf
> >
> > There's nothing left in the file cache when nfsd_open_local_fh()
> > returns. Each call here will do a full file open and a full close.
> >
> >
>
> Good point. This should be calling nfsd_file_acquire_gc(), IMO.

Or the client could do a v4 style acquire, and not call nfsd_file_put()
until it was done with the file.  I don't see a specific problem with
_gc, but avoiding the heuristic it implies seems best where possible.
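
To make the trade-off concrete, here is a minimal sketch of the two
acquire styles under discussion. It is a toy model, not nfsd's
filecache: the demo_* names are hypothetical, there is no garbage
collector, and the last put is assumed to close the file, which is the
behaviour Chuck describes for the current nfsd_open_local_fh().

```c
#include <assert.h>

/* Toy, single-entry stand-in for a cached open file (hypothetical). */
struct demo_cache_entry {
	int refcount;	/* active holders */
	int is_open;	/* nonzero while the underlying file is open */
	int opens;	/* number of full opens paid for so far */
};

/* Take a reference, doing a full open only if nobody holds the file. */
static void demo_acquire(struct demo_cache_entry *e)
{
	if (!e->is_open) {
		e->is_open = 1;
		e->opens++;
	}
	e->refcount++;
}

/* Drop a reference; with no GC in this model, the last put closes. */
static void demo_put(struct demo_cache_entry *e)
{
	assert(e->refcount > 0);
	if (--e->refcount == 0)
		e->is_open = 0;
}

/* Acquire and put around every I/O: one full open per I/O. */
int demo_opens_put_each_time(int nr_ios)
{
	struct demo_cache_entry e = { 0, 0, 0 };
	int i;

	for (i = 0; i < nr_ios; i++) {
		demo_acquire(&e);
		/* ... the I/O would be issued here ... */
		demo_put(&e);
	}
	return e.opens;
}

/* "v4 style": hold one reference across all I/O, put when done. */
int demo_opens_hold_reference(int nr_ios)
{
	struct demo_cache_entry e = { 0, 0, 0 };
	int i;

	demo_acquire(&e);
	for (i = 0; i < nr_ios; i++) {
		/* ... the I/O would be issued here ... */
	}
	demo_put(&e);
	return e.opens;
}
```

In this model, four I/Os cost four opens with the first helper and one
open with the second. A _gc-style acquire would also amortize the open,
but by keeping the file around for a heuristic period after the last
put rather than for an explicitly held lifetime.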

NeilBrown

Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
From: "NeilBrown" <neilb@suse.de>
To: "Jeff Layton" <jlayton@kernel.org>
Cc: "Chuck Lever" <chuck.lever@oracle.com>,
 "Mike Snitzer" <snitzer@kernel.org>, linux-nfs@vger.kernel.org,
 "Anna Schumaker" <anna@kernel.org>,
 "Trond Myklebust" <trondmy@hammerspace.com>, snitzer@hammerspace.com
Subject: Re: [PATCH v9 13/19] nfsd: add "localio" support
In-reply-to: <62ce1426e544778e3c035b26fe8ec7810c43e702.camel@kernel.org>
References: <>, <62ce1426e544778e3c035b26fe8ec7810c43e702.camel@kernel.org>
Date: Mon, 01 Jul 2024 07:56:26 +1000
Message-id: <171978458670.16071.9602917875567248508@noble.neil.brown.name>
Xref: photonic.trudheim.com org.kernel.vger.linux-nfs:86426

On Mon, 01 Jul 2024, Jeff Layton wrote:
> On Sun, 2024-06-30 at 16:15 -0400, Chuck Lever wrote:
> > On Sun, Jun 30, 2024 at 03:59:58PM -0400, Jeff Layton wrote:
> > > On Sun, 2024-06-30 at 15:55 -0400, Chuck Lever wrote:
> > > > On Sun, Jun 30, 2024 at 03:52:51PM -0400, Jeff Layton wrote:
> > > > > On Sun, 2024-06-30 at 15:44 -0400, Mike Snitzer wrote:
> > > > > > On Sun, Jun 30, 2024 at 10:49:51AM -0400, Chuck Lever wrote:
> > > > > > > On Sat, Jun 29, 2024 at 06:18:42PM -0400, Chuck Lever wrote:
> > > > > > > > > +
> > > > > > > > > +	/* nfs_fh -> svc_fh */
> > > > > > > > > +	if (nfs_fh->size > NFS4_FHSIZE) {
> > > > > > > > > +		status = -EINVAL;
> > > > > > > > > +		goto out;
> > > > > > > > > +	}
> > > > > > > > > +	fh_init(&fh, NFS4_FHSIZE);
> > > > > > > > > +	fh.fh_handle.fh_size = nfs_fh->size;
> > > > > > > > > +	memcpy(fh.fh_handle.fh_raw, nfs_fh->data, nfs_fh->size);
> > > > > > > > > +
> > > > > > > > > +	if (fmode & FMODE_READ)
> > > > > > > > > +		mayflags |= NFSD_MAY_READ;
> > > > > > > > > +	if (fmode & FMODE_WRITE)
> > > > > > > > > +		mayflags |= NFSD_MAY_WRITE;
> > > > > > > > > +
> > > > > > > > > +	beres = nfsd_file_acquire(rqstp, &fh, mayflags, &nf);
> > > > > > > > > +	if (beres) {
> > > > > > > > > +		status = nfs_stat_to_errno(be32_to_cpu(beres));
> > > > > > > > > +		goto out_fh_put;
> > > > > > > > > +	}
> > > > > > > >
> > > > > > > > So I'm wondering whether just calling fh_verify() and then
> > > > > > > > nfsd_open_verified() would be simpler and/or good enough here. Is
> > > > > > > > there a strong reason to use the file cache for locally opened
> > > > > > > > files? Jeff, any thoughts?
> > > > > > >
> > > > > > > > Will there be writeback ramifications for
> > > > > > > > doing this? Maybe we need a comment here explaining how these files
> > > > > > > > are garbage collected (just an fput by the local I/O client, I would
> > > > > > > > guess).
> > > > > > >
> > > > > > > OK, going back to this: Since right here we immediately call
> > > > > > >
> > > > > > > 	nfsd_file_put(nf);
> > > > > > >
> > > > > > > There are no writeback ramifications nor any need to comment about
> > > > > > > garbage collection. But this still seems like a lot of (possibly
> > > > > > > unnecessary) overhead for simply obtaining a struct file.
> > > > > >
> > > > > > Easy enough change, probably best to avoid the filecache but would like
> > > > > > to verify with Jeff before switching:
> > > > > >=20
> > > > > > diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
> > > > > > index 1d6508aa931e..85ebf63789fb 100644
> > > > > > --- a/fs/nfsd/localio.c
> > > > > > +++ b/fs/nfsd/localio.c
> > > > > > @@ -197,7 +197,6 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net,
> > > > > >         const struct cred *save_cred;
> > > > > >         struct svc_rqst *rqstp;
> > > > > >         struct svc_fh fh;
> > > > > > -       struct nfsd_file *nf;
> > > > > >         __be32 beres;
> > > > > >
> > > > > >         if (nfs_fh->size > NFS4_FHSIZE)
> > > > > > @@ -235,13 +234,12 @@ int nfsd_open_local_fh(struct net *cl_nfssvc_net,
> > > > > >         if (fmode & FMODE_WRITE)
> > > > > >                 mayflags |= NFSD_MAY_WRITE;
> > > > > >
> > > > > > -       beres = nfsd_file_acquire(rqstp, &fh, mayflags, &nf);
> > > > > > +       beres = fh_verify(rqstp, &fh, S_IFREG, mayflags);
> > > > > >         if (beres) {
> > > > > >                 status = nfs_stat_to_errno(be32_to_cpu(beres));
> > > > > >                 goto out_fh_put;
> > > > > >         }
> > > > > > -       *pfilp = get_file(nf->nf_file);
> > > > > > -       nfsd_file_put(nf);
> > > > > > +       status = nfsd_open_verified(rqstp, &fh, mayflags, pfilp);
> > > > > >  out_fh_put:
> > > > > >         fh_put(&fh);
> > > > > >         nfsd_local_fakerqst_destroy(rqstp);
> > > > > >
> > > > >
> > > > > My suggestion would be to _not_ do this. I think you do want to use the
> > > > > filecache (mostly for performance reasons).
> > > >
> > > > But look carefully:
> > > >
> > > >  -- we're not calling nfsd_file_acquire_gc() here
> > > >
> > > >  -- we're immediately calling nfsd_file_put() on the returned nf
> > > >
> > > > There's nothing left in the file cache when nfsd_open_local_fh()
> > > > returns. Each call here will do a full file open and a full close.
> > >
> > > Good point. This should be calling nfsd_file_acquire_gc(), IMO.
> >
> > So that goes to my point yesterday about writeback ramifications.
> >
> > If these open files linger in the file cache, then when will they
> > get written back to storage and by whom? Is it going to be an nfsd
> > thread writing them back as part of garbage collection?
> >
>
> Usually the client is issuing regular COMMITs. If that doesn't happen,
> then the flusher threads should get the rest.
>
> Side note: I don't guess COMMIT goes over the localio path yet, does
> it? Maybe it should. It would be nice to not tie up an nfsd thread with
> writeback.

The documentation certainly claims that COMMIT uses the localio path.  I
haven't double checked the code but I'd be very surprised if it didn't.

NeilBrown


>
> > So, you're saying that the local I/O client will always behave like
> > NFSv3 in this regard, and open/read/close, open/write/close instead
> > of hanging on to the open file? That seems... suboptimal... and not
> > expected for a local file. That needs to be documented in the
> > LOCALIO design doc.
> >
>
> I imagine so, which is why I suggest using the filecache. If we get one
> READ or WRITE for the file via localio, we're pretty likely to get
> more. Why not amortize that file open over several operations?
>
> > I'm also concerned about local applications closing a file but
> > having an open file handle linger in the file cache -- that can
> > prevent other accesses to the file until the GC ejects that open
> > file, as we've seen in the field.
> >
> > IMHO nfsd_file_acquire_gc() is going to have some unwanted side
> > effects.
> >
>
> Typically, the client issues COMMIT calls when the client-side fd is
> closed (for CTO). While I think we do need to be able to deal with
> flushing files with dirty data that are left "hanging", that shouldn't
> be the common case. Most of the time, the client is going to be issuing
> regular COMMITs so that it can clean its pages.
>
> IOW, I don't see how localio is any different than the case of normal
> v3 IO in this respect.
> --
> Jeff Layton <jlayton@kernel.org>
>


Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
From: "NeilBrown" <neilb@suse.de>
To: "Chuck Lever" <chuck.lever@oracle.com>
Cc: "Mike Snitzer" <snitzer@kernel.org>, linux-nfs@vger.kernel.org,
 "Jeff Layton" <jlayton@kernel.org>, "Anna Schumaker" <anna@kernel.org>,
 "Trond Myklebust" <trondmy@hammerspace.com>, snitzer@hammerspace.com
Subject: Re: [PATCH v9 18/19] SUNRPC: replace program list with program array
In-reply-to: <ZoAvgLhX+fOdoXXU@tissot.1015granger.net>
References: <>, <ZoAvgLhX+fOdoXXU@tissot.1015granger.net>
Date: Mon, 01 Jul 2024 07:57:25 +1000
Message-id: <171978464512.16071.9345065038530000024@noble.neil.brown.name>
Xref: photonic.trudheim.com org.kernel.vger.linux-nfs:86427

On Sun, 30 Jun 2024, Chuck Lever wrote:
> On Fri, Jun 28, 2024 at 05:11:04PM -0400, Mike Snitzer wrote:
> > From: NeilBrown <neil@brown.name>
> >
> > A service created with svc_create_pooled() can be given a linked list of
> > programs and all of these will be served.
> >
> > Using a linked list makes it cumbersome when there are several programs
> > that can be optionally selected with CONFIG settings.
> >
> > So change to use an array with explicit size.  svc_create() is always
> > passed a single program.  svc_create_pooled() now must be used for
> > multiple programs.
>
> Instead of this last sentence, it might be more clear to say:
>
> > After this patch is applied, API consumers must use only
> > svc_create_pooled() when creating an RPC service that listens for
> > more than one RPC program.

Thanks - that's a much clearer way to say it.
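
For readers skimming the diff, the shape of the change can be sketched
outside the kernel. The following is a simplified illustration, not the
sunrpc code: struct demo_program, the DEMO_* switches and the program
numbers are made-up stand-ins, chosen only to show how an array with an
explicit size lets optional programs be compiled in or out without
re-threading a pg_next chain.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Hypothetical build-time switches standing in for CONFIG options. */
#define DEMO_WITH_ACL		1
#define DEMO_WITH_LOCALIO	0

/* Simplified stand-in for struct svc_program (no pg_next pointer). */
struct demo_program {
	unsigned int	 prog;	/* made-up program number */
	unsigned int	 nvers;	/* number of slots in vers[] */
	const int	*vers;	/* nonzero = version implemented */
};

static const int demo_nfs_vers[] = { 0, 0, 1, 1, 1 };
#if DEMO_WITH_ACL
static const int demo_acl_vers[] = { 0, 0, 1, 1 };
#endif
#if DEMO_WITH_LOCALIO
static const int demo_localio_vers[] = { 0, 1 };
#endif

/* Optional entries are simply present or absent; no list pointers to fix up. */
static const struct demo_program demo_programs[] = {
	{ 1, DEMO_ARRAY_SIZE(demo_nfs_vers), demo_nfs_vers },
#if DEMO_WITH_ACL
	{ 2, DEMO_ARRAY_SIZE(demo_acl_vers), demo_acl_vers },
#endif
#if DEMO_WITH_LOCALIO
	{ 3, DEMO_ARRAY_SIZE(demo_localio_vers), demo_localio_vers },
#endif
};

/* Walk every program by index, as the patched loops do with sv_nprogs. */
unsigned int demo_count_versions(const struct demo_program *progs,
				 unsigned int nprogs)
{
	unsigned int p, i, count = 0;

	for (p = 0; p < nprogs; p++) {
		const struct demo_program *progp = &progs[p];

		for (i = 0; i < progp->nvers; i++)
			if (progp->vers[i])
				count++;
	}
	return count;
}
```

A caller would pass the array together with its size, e.g.
demo_count_versions(demo_programs, DEMO_ARRAY_SIZE(demo_programs)),
mirroring how the patch passes nfsd_programs and
ARRAY_SIZE(nfsd_programs) to svc_create_pooled().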

NeilBrown

>
> I like the idea of replacing these static linked lists.
>
>
> > Signed-off-by: NeilBrown <neil@brown.name>
> > Signed-off-by: Mike Snitzer <snitzer@kernel.org>
> > ---
> >  fs/nfsd/nfsctl.c           |  2 +-
> >  fs/nfsd/nfsd.h             |  2 +-
> >  fs/nfsd/nfssvc.c           | 69 ++++++++++++++++++--------------------
> >  include/linux/sunrpc/svc.h |  7 ++--
> >  net/sunrpc/svc.c           | 68 +++++++++++++++++++++----------------
> >  net/sunrpc/svc_xprt.c      |  2 +-
> >  net/sunrpc/svcauth_unix.c  |  3 +-
> >  7 files changed, 80 insertions(+), 73 deletions(-)
> >
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index e5d2cc74ef77..6fb92bb61c6d 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -2265,7 +2265,7 @@ static __net_init int nfsd_net_init(struct net *net)
> >  	if (retval)
> >  		goto out_repcache_error;
> >  	memset(&nn->nfsd_svcstats, 0, sizeof(nn->nfsd_svcstats));
> > -	nn->nfsd_svcstats.program = &nfsd_program;
> > +	nn->nfsd_svcstats.program = &nfsd_programs[0];
> >  	nn->nfsd_versions = NULL;
> >  	nn->nfsd4_minorversions = NULL;
> >  	nfsd4_init_leases_net(nn);
> > diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> > index cec8697b1cd6..c3f7c5957950 100644
> > --- a/fs/nfsd/nfsd.h
> > +++ b/fs/nfsd/nfsd.h
> > @@ -80,7 +80,7 @@ struct nfsd_genl_rqstp {
> >  	u32			rq_opnum[NFSD_MAX_OPS_PER_COMPOUND];
> >  };
> >
> > -extern struct svc_program	nfsd_program;
> > +extern struct svc_program	nfsd_programs[];
> >  extern const struct svc_version	nfsd_version2, nfsd_version3, nfsd_version4;
> >  extern struct mutex		nfsd_mutex;
> >  extern spinlock_t		nfsd_drc_lock;
> > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > index 6cc6a1971e21..ef2532303ece 100644
> > --- a/fs/nfsd/nfssvc.c
> > +++ b/fs/nfsd/nfssvc.c
> > @@ -36,7 +36,6 @@
> >  #define NFSDDBG_FACILITY	NFSDDBG_SVC
> >
> >  atomic_t			nfsd_th_cnt = ATOMIC_INIT(0);
> > -extern struct svc_program	nfsd_program;
> >  static int			nfsd(void *vrqstp);
> >  #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
> >  static int			nfsd_acl_rpcbind_set(struct net *,
> > @@ -89,16 +88,6 @@ static const struct svc_version *localio_versions[] = {
> >
> >  #define NFSD_LOCALIO_NRVERS		ARRAY_SIZE(localio_versions)
> >
> > -static struct svc_program	nfsd_localio_program = {
> > -	.pg_prog		= NFS_LOCALIO_PROGRAM,
> > -	.pg_nvers		= NFSD_LOCALIO_NRVERS,
> > -	.pg_vers		= localio_versions,
> > -	.pg_name		= "nfslocalio",
> > -	.pg_class		= "nfsd",
> > -	.pg_authenticate	= &svc_set_client,
> > -	.pg_init_request	= svc_generic_init_request,
> > -	.pg_rpcbind_set		= svc_generic_rpcbind_set,
> > -};
> >  #endif /* CONFIG_NFSD_LOCALIO */
> >
> >  #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
> > @@ -111,23 +100,9 @@ static const struct svc_version *nfsd_acl_version[] = {
> >  # endif
> >  };
> >
> > -#define NFSD_ACL_MINVERS            2
> > +#define NFSD_ACL_MINVERS	2
> >  #define NFSD_ACL_NRVERS		ARRAY_SIZE(nfsd_acl_version)
> >
> > -static struct svc_program	nfsd_acl_program = {
> > -#if IS_ENABLED(CONFIG_NFSD_LOCALIO)
> > -	.pg_next		= &nfsd_localio_program,
> > -#endif /* CONFIG_NFSD_LOCALIO */
> > -	.pg_prog		= NFS_ACL_PROGRAM,
> > -	.pg_nvers		= NFSD_ACL_NRVERS,
> > -	.pg_vers		= nfsd_acl_version,
> > -	.pg_name		= "nfsacl",
> > -	.pg_class		= "nfsd",
> > -	.pg_authenticate	= &svc_set_client,
> > -	.pg_init_request	= nfsd_acl_init_request,
> > -	.pg_rpcbind_set		= nfsd_acl_rpcbind_set,
> > -};
> > -
> >  #endif /* defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) */
> >
> >  static const struct svc_version *nfsd_version[] = {
> > @@ -140,25 +115,44 @@ static const struct svc_version *nfsd_version[] = {
> >  #endif
> >  };
> >
> > -#define NFSD_MINVERS    	2
> > +#define NFSD_MINVERS		2
> >  #define NFSD_NRVERS		ARRAY_SIZE(nfsd_version)
> >
> > -struct svc_program		nfsd_program = {
> > -#if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
> > -	.pg_next		= &nfsd_acl_program,
> > -#else
> > -#if IS_ENABLED(CONFIG_NFSD_LOCALIO)
> > -	.pg_next		= &nfsd_localio_program,
> > -#endif /* CONFIG_NFSD_LOCALIO */
> > -#endif
> > +struct svc_program		nfsd_programs[] = {
> > +	{
> >  	.pg_prog		= NFS_PROGRAM,		/* program number */
> >  	.pg_nvers		= NFSD_NRVERS,		/* nr of entries in nfsd_version */
> >  	.pg_vers		= nfsd_version,		/* version table */
> >  	.pg_name		= "nfsd",		/* program name */
> >  	.pg_class		= "nfsd",		/* authentication class */
> > -	.pg_authenticate	= &svc_set_client,	/* export authentication */
> > +	.pg_authenticate	= svc_set_client,	/* export authentication */
> >  	.pg_init_request	= nfsd_init_request,
> >  	.pg_rpcbind_set		= nfsd_rpcbind_set,
> > +	},
> > +#if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
> > +	{
> > +	.pg_prog		= NFS_ACL_PROGRAM,
> > +	.pg_nvers		= NFSD_ACL_NRVERS,
> > +	.pg_vers		= nfsd_acl_version,
> > +	.pg_name		= "nfsacl",
> > +	.pg_class		= "nfsd",
> > +	.pg_authenticate	= svc_set_client,
> > +	.pg_init_request	= nfsd_acl_init_request,
> > +	.pg_rpcbind_set		= nfsd_acl_rpcbind_set,
> > +	},
> > +#endif /* defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL) */
> > +#if IS_ENABLED(CONFIG_NFSD_LOCALIO)
> > +	{
> > +	.pg_prog		= NFS_LOCALIO_PROGRAM,
> > +	.pg_nvers		= NFSD_LOCALIO_NRVERS,
> > +	.pg_vers		= localio_versions,
> > +	.pg_name		= "nfslocalio",
> > +	.pg_class		= "nfsd",
> > +	.pg_authenticate	= svc_set_client,
> > +	.pg_init_request	= svc_generic_init_request,
> > +	.pg_rpcbind_set		= svc_generic_rpcbind_set,
> > +	}
> > +#endif /* IS_ENABLED(CONFIG_NFSD_LOCALIO) */
> >  };
> >
> >  bool nfsd_support_version(int vers)
> > @@ -735,7 +729,8 @@ int nfsd_create_serv(struct net *net)
> >  	if (nfsd_max_blksize == 0)
> >  		nfsd_max_blksize = nfsd_get_default_max_blksize();
> >  	nfsd_reset_versions(nn);
> > -	serv = svc_create_pooled(&nfsd_program, &nn->nfsd_svcstats,
> > +	serv = svc_create_pooled(nfsd_programs, ARRAY_SIZE(nfsd_programs),
> > +				 &nn->nfsd_svcstats,
> >  				 nfsd_max_blksize, nfsd);
> >  	if (serv == NULL)
> >  		return -ENOMEM;
> > diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> > index a7d0406b9ef5..7c86b1696398 100644
> > --- a/include/linux/sunrpc/svc.h
> > +++ b/include/linux/sunrpc/svc.h
> > @@ -66,9 +66,10 @@ enum {
> >   * We currently do not support more than one RPC program per daemon.
> >   */
> >  struct svc_serv {
> > -	struct svc_program *	sv_program;	/* RPC program */
> > +	struct svc_program *	sv_programs;	/* RPC programs */
> >  	struct svc_stat *	sv_stats;	/* RPC statistics */
> >  	spinlock_t		sv_lock;
> > +	unsigned int		sv_nprogs;	/* Number of sv_programs */
> >  	unsigned int		sv_nrthreads;	/* # of server threads */
> >  	unsigned int		sv_maxconn;	/* max connections allowed or
> >  						 * '0' causing max to be based
> > @@ -329,10 +330,9 @@ struct svc_process_info {
> >  };
> >
> >  /*
> > - * List of RPC programs on the same transport endpoint
> > + * RPC program - an array of these can use the same transport endpoint
> >   */
> >  struct svc_program {
> > -	struct svc_program *	pg_next;	/* other programs (same xprt) */
> >  	u32			pg_prog;	/* program number */
> >  	unsigned int		pg_lovers;	/* lowest version */
> >  	unsigned int		pg_hivers;	/* highest version */
> > @@ -414,6 +414,7 @@ void		   svc_rqst_release_pages(struct svc_rqst *rqstp);
> >  void		   svc_rqst_free(struct svc_rqst *);
> >  void		   svc_exit_thread(struct svc_rqst *);
> >  struct svc_serv *  svc_create_pooled(struct svc_program *prog,
> > +				     unsigned int nprog,
> >  				     struct svc_stat *stats,
> >  				     unsigned int bufsize,
> >  				     int (*threadfn)(void *data));
> > diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> > index 965a27806bfd..d9f348aa0672 100644
> > --- a/net/sunrpc/svc.c
> > +++ b/net/sunrpc/svc.c
> > @@ -440,10 +440,11 @@ EXPORT_SYMBOL_GPL(svc_rpcb_cleanup);
> >
> >  static int svc_uses_rpcbind(struct svc_serv *serv)
> >  {
> > -	struct svc_program	*progp;
> > -	unsigned int		i;
> > +	unsigned int		p, i;
> > +
> > +	for (p = 0; p < serv->sv_nprogs; p++) {
> > +		struct svc_program *progp = &serv->sv_programs[p];
> >
> > -	for (progp = serv->sv_program; progp; progp = progp->pg_next) {
> >  		for (i = 0; i < progp->pg_nvers; i++) {
> >  			if (progp->pg_vers[i] == NULL)
> >  				continue;
> > @@ -480,7 +481,7 @@ __svc_init_bc(struct svc_serv *serv)
> >   * Create an RPC service
> >   */
> >  static struct svc_serv *
> > -__svc_create(struct svc_program *prog, struct svc_stat *stats,
> > +__svc_create(struct svc_program *prog, int nprogs, struct svc_stat *stats,
> >  	     unsigned int bufsize, int npools, int (*threadfn)(void *data))
> >  {
> >  	struct svc_serv	*serv;
> > @@ -491,7 +492,8 @@ __svc_create(struct svc_program *prog, struct svc_stat *stats,
> >  	if (!(serv = kzalloc(sizeof(*serv), GFP_KERNEL)))
> >  		return NULL;
> >  	serv->sv_name      = prog->pg_name;
> > -	serv->sv_program   = prog;
> > +	serv->sv_programs  = prog;
> > +	serv->sv_nprogs    = nprogs;
> >  	serv->sv_stats     = stats;
> >  	if (bufsize > RPCSVC_MAXPAYLOAD)
> >  		bufsize = RPCSVC_MAXPAYLOAD;
> > @@ -499,17 +501,18 @@ __svc_create(struct svc_program *prog, struct svc_stat *stats,
> >  	serv->sv_max_mesg  = roundup(serv->sv_max_payload + PAGE_SIZE, PAGE_SIZE);
> >  	serv->sv_threadfn = threadfn;
> >  	xdrsize = 0;
> > -	while (prog) {
> > -		prog->pg_lovers = prog->pg_nvers-1;
> > -		for (vers=0; vers<prog->pg_nvers ; vers++)
> > -			if (prog->pg_vers[vers]) {
> > -				prog->pg_hivers = vers;
> > -				if (prog->pg_lovers > vers)
> > -					prog->pg_lovers = vers;
> > -				if (prog->pg_vers[vers]->vs_xdrsize > xdrsize)
> > -					xdrsize = prog->pg_vers[vers]->vs_xdrsize;
> > +	for (i = 0; i < nprogs; i++) {
> > +		struct svc_program *progp = &prog[i];
> > +
> > +		progp->pg_lovers = progp->pg_nvers-1;
> > +		for (vers = 0; vers < progp->pg_nvers ; vers++)
> > +			if (progp->pg_vers[vers]) {
> > +				progp->pg_hivers = vers;
> > +				if (progp->pg_lovers > vers)
> > +					progp->pg_lovers = vers;
> > +				if (progp->pg_vers[vers]->vs_xdrsize > xdrsize)
> > +					xdrsize = progp->pg_vers[vers]->vs_xdrsize;
> >  			}
> > -		prog = prog->pg_next;
> >  	}
> >  	serv->sv_xdrsize   = xdrsize;
> >  	INIT_LIST_HEAD(&serv->sv_tempsocks);
> > @@ -558,13 +561,14 @@ __svc_create(struct svc_program *prog, struct svc_stat *stats,
> >  struct svc_serv *svc_create(struct svc_program *prog, unsigned int bufsize,
> >  			    int (*threadfn)(void *data))
> >  {
> > -	return __svc_create(prog, NULL, bufsize, 1, threadfn);
> > +	return __svc_create(prog, 1, NULL, bufsize, 1, threadfn);
> >  }
> >  EXPORT_SYMBOL_GPL(svc_create);
> >
> >  /**
> >   * svc_create_pooled - Create an RPC service with pooled threads
> > - * @prog: the RPC program the new service will handle
> > + * @prog:  Array of RPC programs the new service will handle
> > + * @nprogs: Number of programs in the array
> >   * @stats: the stats struct if desired
> >   * @bufsize: maximum message size for @prog
> >   * @threadfn: a function to service RPC requests for @prog
> > @@ -572,6 +576,7 @@ EXPORT_SYMBOL_GPL(svc_create);
> >   * Returns an instantiated struct svc_serv object or NULL.
> >   */
> >  struct svc_serv *svc_create_pooled(struct svc_program *prog,
> > +				   unsigned int nprogs,
> >  				   struct svc_stat *stats,
> >  				   unsigned int bufsize,
> >  				   int (*threadfn)(void *data))
> > @@ -579,7 +584,7 @@ struct svc_serv *svc_create_pooled(struct svc_program *prog,
> >  	struct svc_serv *serv;
> >  	unsigned int npools = svc_pool_map_get();
> >
> > -	serv = __svc_create(prog, stats, bufsize, npools, threadfn);
> > +	serv = __svc_create(prog, nprogs, stats, bufsize, npools, threadfn);
> >  	if (!serv)
> >  		goto out_err;
> >  	serv->sv_is_pooled = true;
> > @@ -602,16 +607,16 @@ svc_destroy(struct svc_serv **servp)
> > =20
> >  	*servp =3D NULL;
> > =20
> > -	dprintk("svc: svc_destroy(%s)\n", serv->sv_program->pg_name);
> > +	dprintk("svc: svc_destroy(%s)\n", serv->sv_programs->pg_name);
> >  	timer_shutdown_sync(&serv->sv_temptimer);
> >  
> >  	/*
> >  	 * Remaining transports at this point are not expected.
> >  	 */
> >  	WARN_ONCE(!list_empty(&serv->sv_permsocks),
> > -		  "SVC: permsocks remain for %s\n", serv->sv_program->pg_name);
> > +		  "SVC: permsocks remain for %s\n", serv->sv_programs->pg_name);
> >  	WARN_ONCE(!list_empty(&serv->sv_tempsocks),
> > -		  "SVC: tempsocks remain for %s\n", serv->sv_program->pg_name);
> > +		  "SVC: tempsocks remain for %s\n", serv->sv_programs->pg_name);
> >  
> >  	cache_clean_deferred(serv);
> >  
> > @@ -1156,15 +1161,16 @@ int svc_register(const struct svc_serv *serv, struct net *net,
> >  		 const int family, const unsigned short proto,
> >  		 const unsigned short port)
> >  {
> > -	struct svc_program	*progp;
> > -	unsigned int		i;
> > +	unsigned int		p, i;
> >  	int			error = 0;
> >  
> >  	WARN_ON_ONCE(proto == 0 && port == 0);
> >  	if (proto == 0 && port == 0)
> >  		return -EINVAL;
> >  
> > -	for (progp = serv->sv_program; progp; progp = progp->pg_next) {
> > +	for (p = 0; p < serv->sv_nprogs; p++) {
> > +		struct svc_program *progp = &serv->sv_programs[p];
> > +
> >  		for (i = 0; i < progp->pg_nvers; i++) {
> >  
> >  			error = progp->pg_rpcbind_set(net, progp, i,
> > @@ -1216,13 +1222,14 @@ static void __svc_unregister(struct net *net, const u32 program, const u32 versi
> >  static void svc_unregister(const struct svc_serv *serv, struct net *net)
> >  {
> >  	struct sighand_struct *sighand;
> > -	struct svc_program *progp;
> >  	unsigned long flags;
> > -	unsigned int i;
> > +	unsigned int p, i;
> >  
> >  	clear_thread_flag(TIF_SIGPENDING);
> >  
> > -	for (progp = serv->sv_program; progp; progp = progp->pg_next) {
> > +	for (p = 0; p < serv->sv_nprogs; p++) {
> > +		struct svc_program *progp = &serv->sv_programs[p];
> > +
> >  		for (i = 0; i < progp->pg_nvers; i++) {
> >  			if (progp->pg_vers[i] == NULL)
> >  				continue;
> > @@ -1328,7 +1335,7 @@ svc_process_common(struct svc_rqst *rqstp)
> >  	struct svc_process_info process;
> >  	enum svc_auth_status	auth_res;
> >  	unsigned int		aoffset;
> > -	int			rc;
> > +	int			pr, rc;
> >  	__be32			*p;
> >  
> >  	/* Will be turned off only when NFSv4 Sessions are used */
> > @@ -1352,9 +1359,12 @@ svc_process_common(struct svc_rqst *rqstp)
> >  	rqstp->rq_vers = be32_to_cpup(p++);
> >  	rqstp->rq_proc = be32_to_cpup(p);
> >  
> > -	for (progp = serv->sv_program; progp; progp = progp->pg_next)
> > +	for (pr = 0; pr < serv->sv_nprogs; pr++) {
> > +		progp = &serv->sv_programs[pr];
> > +
> >  		if (rqstp->rq_prog == progp->pg_prog)
> >  			break;
> > +	}
> >  
> >  	/*
> >  	 * Decode auth data, and add verifier to reply buffer.
> > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > index d3735ab3e6d1..16634afdf253 100644
> > --- a/net/sunrpc/svc_xprt.c
> > +++ b/net/sunrpc/svc_xprt.c
> > @@ -268,7 +268,7 @@ static int _svc_xprt_create(struct svc_serv *serv, const char *xprt_name,
> >  		spin_unlock(&svc_xprt_class_lock);
> >  		newxprt = xcl->xcl_ops->xpo_create(serv, net, sap, len, flags);
> >  		if (IS_ERR(newxprt)) {
> > -			trace_svc_xprt_create_err(serv->sv_program->pg_name,
> > +			trace_svc_xprt_create_err(serv->sv_programs->pg_name,
> >  						  xcl->xcl_name, sap, len,
> >  						  newxprt);
> >  			module_put(xcl->xcl_owner);
> > diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
> > index 04b45588ae6f..8ca98b146ec8 100644
> > --- a/net/sunrpc/svcauth_unix.c
> > +++ b/net/sunrpc/svcauth_unix.c
> > @@ -697,7 +697,8 @@ svcauth_unix_set_client(struct svc_rqst *rqstp)
> >  	rqstp->rq_auth_stat =3D rpc_autherr_badcred;
> >  	ipm =3D ip_map_cached_get(xprt);
> >  	if (ipm == NULL)
> > -		ipm = __ip_map_lookup(sn->ip_map_cache, rqstp->rq_server->sv_program->pg_class,
> > +		ipm = __ip_map_lookup(sn->ip_map_cache,
> > +				      rqstp->rq_server->sv_programs->pg_class,
> >  				    &sin6->sin6_addr);
> >  
> >  	if (ipm == NULL)
> > -- 
> > 2.44.0
> > 
> > 
> 
> -- 
> Chuck Lever
> 
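
For anyone skimming the quoted __svc_create() hunk: the per-program
version-range scan it now performs over an array can be sketched in
userspace roughly as below. This is only an illustration. The names
struct demo_version, struct demo_program, demo_init_versions() and
demo_check() are hypothetical stand-ins, not the kernel's types, and
the initial xdrsize of 0 is an assumption since that line is outside
the quoted context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct svc_version (only the field used here). */
struct demo_version {
	unsigned int vs_xdrsize;
};

/* Hypothetical stand-in for struct svc_program. */
struct demo_program {
	unsigned int pg_nvers;			/* number of slots in pg_vers */
	const struct demo_version **pg_vers;	/* NULL slot = version not supported */
	unsigned int pg_lovers;			/* lowest supported version */
	unsigned int pg_hivers;			/* highest supported version */
};

/*
 * Walk an array of programs (instead of a pg_next linked list), record
 * each program's lowest/highest populated version slot, and return the
 * largest vs_xdrsize seen across all programs.
 */
static unsigned int demo_init_versions(struct demo_program *progs,
					unsigned int nprogs)
{
	unsigned int xdrsize = 0;	/* assumed starting value */
	unsigned int i, vers;

	for (i = 0; i < nprogs; i++) {
		struct demo_program *progp = &progs[i];

		progp->pg_lovers = progp->pg_nvers - 1;
		for (vers = 0; vers < progp->pg_nvers; vers++) {
			if (!progp->pg_vers[vers])
				continue;
			progp->pg_hivers = vers;
			if (progp->pg_lovers > vers)
				progp->pg_lovers = vers;
			if (progp->pg_vers[vers]->vs_xdrsize > xdrsize)
				xdrsize = progp->pg_vers[vers]->vs_xdrsize;
		}
	}
	return xdrsize;
}

/* Usage example: two programs with different populated version slots. */
static void demo_check(void)
{
	static const struct demo_version a2 = { 8 }, a3 = { 16 }, b1 = { 4 };
	const struct demo_version *a_vers[] = { NULL, NULL, &a2, &a3 };
	const struct demo_version *b_vers[] = { NULL, &b1 };
	struct demo_program progs[] = {
		{ .pg_nvers = 4, .pg_vers = a_vers },
		{ .pg_nvers = 2, .pg_vers = b_vers },
	};

	assert(demo_init_versions(progs, 2) == 16);
	assert(progs[0].pg_lovers == 2 && progs[0].pg_hivers == 3);
	assert(progs[1].pg_lovers == 1 && progs[1].pg_hivers == 1);
}
```

The point of the array form is that each program's range is computed
independently while xdrsize stays a maximum across the whole set.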


