From mboxrd@z Thu Jan 1 00:00:00 1970 From: Claude Code Review Bot To: dri-devel-reviews@example.com Subject: Claude review: vfio/iommufd: use iova_to_phys_length for efficient unmap Date: Thu, 04 Jun 2026 15:00:35 +1000 Message-ID: In-Reply-To: <20260531093637.3893199-8-guanghuifeng@linux.alibaba.com> References: <20260531093637.3893199-1-guanghuifeng@linux.alibaba.com> <20260531093637.3893199-8-guanghuifeng@linux.alibaba.com> X-Mailer: Claude Code Patch Reviewer Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit MIME-Version: 1.0 Patch Review This is the key performance patch. It has the most complex logic changes. **iommufd/pages.c -- npages calculation bug:** ```c npages = (pgsize - (iova & (pgsize - 1)) + page_offset) / PAGE_SIZE; ``` This calculates the remaining pages within the current PTE from the current position. However, the `+ page_offset` term is wrong. `page_offset` represents the sub-page offset at the *start* of the area, and it's only non-zero for the first iteration (`start_index == iopt_area_index(area)`). Adding it increases npages, potentially stepping *past* the PTE boundary. Consider: if `iova` is at offset 0 within a 2MB PTE and `page_offset` is, say, 0x800 (2048 bytes), then: - `pgsize - (iova & (pgsize - 1))` = 2MB (correct remaining within PTE) - Adding `page_offset` makes it 2MB + 2048 - Dividing by PAGE_SIZE gives 512 + 0 (integer division), which happens to be correct - But if page_offset were larger or aligned differently, this could yield 513 The `page_offset` should NOT be added here. The number of 4K pages remaining in the PTE from our current IOVA position is simply `(pgsize - (iova & (pgsize - 1))) / PAGE_SIZE`. The `page_offset` adjustment is already handled by the `iova += npages * PAGE_SIZE - page_offset` at the bottom. **VFIO type1 -- len calculation:** ```c len = pgsize - (iova & (pgsize - 1)); ``` This correctly calculates the remaining bytes from `iova` to the end of the current PTE. Then the loop tries to merge with subsequent physically contiguous PTEs: ```c for (; pos + len < dma->size; ) { ... len += next_pgsize; } ``` When `next_pgsize` is returned as the full PTE size, `len` accumulates the total contiguous region. However, the original code incremented by `PAGE_SIZE` and checked physical contiguity at each step. The new code adds `next_pgsize` (the full PTE size) and only checks contiguity at PTE boundaries. This is correct *if* each PTE is fully contiguous internally (which it is by definition), but it changes the granularity of the contiguity check. This is fine semantically. **However**, there's a subtle issue: if `len` overshoots `dma->size` (because the last PTE extends beyond the mapped region), the loop condition `pos + len < dma->size` handles it. But the `len` value passed to `unmap_unpin_fast/slow` could exceed the remaining `dma->size - pos`. Looking at the original code, it had the same potential since `len` was accumulated without a size cap, and the unmap functions handle it. So this is OK. **selftest change**: Switching from `mock->domain.ops->iova_to_phys()` to `iommu_iova_to_phys()` in the selftest is a good cleanup -- using the public API instead of reaching into the ops directly. --- Generated by Claude Code Patch Reviewer