From: Zhiping Zhang <zhipingz@meta.com>
To: Alex Williamson <alex@shazbot.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>, Leon Romanovsky <leon@kernel.org>,
Bjorn Helgaas <helgaas@kernel.org>,
kvm@vger.kernel.org, linux-rdma@vger.kernel.org,
linux-pci@vger.kernel.org, netdev@vger.kernel.org,
dri-devel@lists.freedesktop.org, Keith Busch <kbusch@kernel.org>,
Yochai Cohen <yochai@nvidia.com>,
Yishai Hadas <yishaih@nvidia.com>
Subject: Re: [PATCH v4 1/3] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature
Date: Fri, 22 May 2026 18:03:57 -0700 [thread overview]
Message-ID: <CAH3zFs1gh6UZewMf9GJ6CtO5v0yWNokp-eZQ8G2QnyAyxD7-wQ@mail.gmail.com> (raw)
In-Reply-To: <20260521162437.406085db@shazbot.org>
On Thu, May 21, 2026 at 3:24 PM Alex Williamson <alex@shazbot.org> wrote:
>
> >
> On Thu, 21 May 2026 16:04:12 -0600
> Alex Williamson <alex@shazbot.org> wrote:
>
> > On Tue, 19 May 2026 13:13:49 -0700
> > Zhiping Zhang <zhipingz@meta.com> wrote:
> >
> > > Add a dma-buf get_tph callback for exporters to return TPH
> > > (TLP Processing Hints) metadata, and add VFIO_DEVICE_FEATURE_DMA_BUF_TPH
> > > so userspace can attach that metadata to a VFIO-exported dma-buf.
> >
> > This should be two patches, the first extending the dma-buf framework
> > for the get_tph callback for explicit approval from dma-buf maintainers
> > (who are not even copied here). The second the vfio-pci implementation
> > of get_tph.
> >
> > > 8-bit ST and 16-bit Extended ST are distinct PCIe TPH namespaces; the
> > > uAPI carries both with explicit validity flags so importers get the
> > > value matching their requested width. SET is write-once per dma-buf;
> > > the existing VFIO_DEVICE_FEATURE_DMA_BUF uAPI is unchanged.
> >
> > I didn't see what motivated this write-once change, I thought we
> > understood that it was a userspace problem that the tph values need to
> > be set before providing the dma-buf fd to the importer and that races
> > relative to that are a userspace ordering problem. Write-once seems
> > unnecessarily restrictive and there's no justification provided here.
> >
> > > Signed-off-by: Zhiping Zhang <zhipingz@meta.com>
> > > ---
> > > drivers/vfio/pci/vfio_pci_core.c | 3 +
> > > drivers/vfio/pci/vfio_pci_dmabuf.c | 134 +++++++++++++++++++++++++++--
> > > drivers/vfio/pci/vfio_pci_priv.h | 12 +++
> > > include/linux/dma-buf.h | 21 +++++
> > > include/uapi/linux/vfio.h | 35 ++++++++
> > > 5 files changed, 198 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> > > index 3f8d093aacf8..94aa6dd95701 100644
> > > --- a/drivers/vfio/pci/vfio_pci_core.c
> > > +++ b/drivers/vfio/pci/vfio_pci_core.c
> > > @@ -1534,6 +1534,9 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
> > > return vfio_pci_core_feature_token(vdev, flags, arg, argsz);
> > > case VFIO_DEVICE_FEATURE_DMA_BUF:
> > > return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz);
> > > + case VFIO_DEVICE_FEATURE_DMA_BUF_TPH:
> > > + return vfio_pci_core_feature_dma_buf_tph(vdev, flags, arg,
> > > + argsz);
> > > default:
> > > return -ENOTTY;
> > > }
> > > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > > index f87fd32e4a01..be1c65385670 100644
> > > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> > > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > > @@ -19,7 +19,24 @@ struct vfio_pci_dma_buf {
> > > u32 nr_ranges;
> > > struct kref kref;
> > > struct completion comp;
> > > - u8 revoked : 1;
> > > + /*
> > > + * TPH metadata published by VFIO_DEVICE_FEATURE_DMA_BUF_TPH and
> > > + * consumed by the @get_tph dma-buf callback.
> > > + *
> > > + * @tph_flags is the publish/consume gate: writers populate
> > > + * @steering_tag, @steering_tag_ext and @ph first, then store
> > > + * @tph_flags with smp_store_release(); readers do
> > > + * smp_load_acquire(&tph_flags) before accessing the value fields.
> > > + * @tph_flags == 0 means "TPH not set". Writers publish a non-zero
> > > + * value only once per dma-buf and serialize via vdev->memory_lock;
> > > + * readers stay lockless to avoid AB-BA against the dma_resv_lock held
> > > + * by importers.
> > > + */
> >
> > Can you outline the ABBA hazard, I'm not seeing it. You're acquiring
> > memory_lock in the feature SET and dma_resv_lock doesn't appear to be
> > held when calling .get_tph(). There's a lot of lockless complication
> > here balanced on this claim of avoiding a hazard that doesn't appear
> > present.
> >
> > > + u32 tph_flags;
> > > + u16 steering_tag_ext;
> > > + u8 steering_tag;
> > > + u8 ph;
> > > + bool revoked;
> >
> > If we still used memory_lock for tph, these could be:
> >
> > u8 tph_st_valid:1; /* memory_lock */
> > u8 tph_st_ext_valid:1; /* memory_lock */
> > u8 tph_ph:2; /* memory_lock */
> > u8 tph_st;
> > u16 tph_st_ext;
> > u8 revoked:1; /* dma_resv_lock */
> >
> > The existing change of @revoked from bitfield to bool has no rationale
> > noted for it in the commit log.
>
> On second thought, what dependency does anything here have on
> memory_lock? I think we're jumping through hoops to avoid a lock we
> don't even need. If we just want to serialize SET vs get_tph we could
> have a mutex on the dma-buf structure, or use RCU if we want to manage
> it locklessly and make sure get_tph always sees a fully consistent set
> of values. Thanks,
>
> Alex
Agreed, we don't need memory_lock in this path. For v5 I'll instead add a
struct mutex lock to struct vfio_pci_dma_buf and take it in SET,
get_tph,
and around the priv->vdev = NULL store in cleanup.
Thanks,
Zhiping
next prev parent reply other threads:[~2026-05-23 1:04 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-19 20:13 [PATCH v4 0/3] vfio/dma-buf: add TPH support for peer-to-peer access Zhiping Zhang
2026-05-19 20:13 ` [PATCH v4 1/3] vfio: add dma-buf get_tph callback and DMA_BUF_TPH feature Zhiping Zhang
2026-05-21 22:04 ` Alex Williamson
2026-05-21 22:24 ` Alex Williamson
2026-05-23 1:03 ` Zhiping Zhang [this message]
2026-05-22 23:53 ` Zhiping Zhang
2026-05-25 12:27 ` Claude review: " Claude Code Review Bot
2026-05-19 20:13 ` [PATCH v4 2/3] PCI/TPH: expose the enabled TPH requester type Zhiping Zhang
2026-05-25 12:27 ` Claude review: " Claude Code Review Bot
2026-05-19 20:13 ` [PATCH v4 3/3] RDMA/mlx5: get tph for p2p access when registering dma-buf mr Zhiping Zhang
2026-05-25 12:27 ` Claude review: " Claude Code Review Bot
2026-05-25 12:27 ` Claude review: vfio/dma-buf: add TPH support for peer-to-peer access Claude Code Review Bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAH3zFs1gh6UZewMf9GJ6CtO5v0yWNokp-eZQ8G2QnyAyxD7-wQ@mail.gmail.com \
--to=zhipingz@meta.com \
--cc=alex@shazbot.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=helgaas@kernel.org \
--cc=jgg@ziepe.ca \
--cc=kbusch@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=leon@kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=yishaih@nvidia.com \
--cc=yochai@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox