vfs: fix page locking deadlocks when deduping files [Linux 4.19.72]

The Linux kernel change "vfs: fix page locking deadlocks when deduping files" is included in the Linux 4.19.72 release. It was authored by Darrick J. Wong <darrick.wong [at] oracle.com> on Sun Aug 11 15:52:25 2019 -0700. In the Linux stable tree the commit is ac3cc25 (patch), which is derived from upstream commit edc58dd. The same upstream change may also have been applied to other maintained Linux releases.

vfs: fix page locking deadlocks when deduping files

[ Upstream commit edc58dd0123b552453a74369bd0c8d890b497b4b ]

When dedupe wants to use the page cache to compare parts of two files
for dedupe, we must be very careful to handle locking correctly.  The
current code doesn't do this.  It must lock and unlock the page only
once if the two pages are the same, since the overlapping range check
doesn't catch this when blocksize < pagesize.  If the pages are distinct
but from the same file, we must observe page locking order and lock them
in order of increasing offset to avoid clashing with writeback locking.

Fixes: 876bec6f9bbfcb3 ("vfs: refactor clone/dedupe_file_range common functions")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
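
To make the "blocksize < pagesize" case in the commit message concrete, here is a small arithmetic sketch. It is not kernel code: the helper names, the 4096-byte page size, and the 1024-byte block size are assumptions chosen for illustration, and the overlap test is only the general shape of a byte-range check. It shows that two block-aligned ranges in the same file can be disjoint as byte ranges and still live in the same page-cache page.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4096-byte pages */

/* True if [a, a + len) and [b, b + len) share at least one byte. */
static bool demo_ranges_overlap(uint64_t a, uint64_t b, uint64_t len)
{
    return a < b + len && b < a + len;
}

/* Index of the page-cache page that holds byte offset off. */
static uint64_t demo_page_index(uint64_t off)
{
    return off >> DEMO_PAGE_SHIFT;
}
```

With a 1024-byte block, a source range starting at offset 0 and a destination range starting at offset 1024 (each 1024 bytes long) do not overlap, yet both offsets map to page index 0. Under these assumptions, a caller that locks "the source page" and then "the destination page" would be trying to lock the same page twice.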

This change adds or deletes 49 lines of Linux source code (41 insertions, 8 deletions). The code changes are as follows.

 fs/read_write.c | 49 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 85fd7a8..5fb5ee5 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1888,10 +1888,7 @@ int vfs_clone_file_range(struct file *file_in, loff_t pos_in,
 }
 EXPORT_SYMBOL(vfs_clone_file_range);

-/*
- * Read a page's worth of file data into the page cache.  Return the page
- * locked.
- */
+/* Read a page's worth of file data into the page cache. */
 static struct page *vfs_dedupe_get_page(struct inode *inode, loff_t offset)
 {
    struct address_space *mapping;
@@ -1907,11 +1904,33 @@ static struct page *vfs_dedupe_get_page(struct inode *inode, loff_t offset)
        put_page(page);
        return ERR_PTR(-EIO);
    }
-   lock_page(page);
    return page;
 }

 /*
+ * Lock two pages, ensuring that we lock in offset order if the pages are from
+ * the same file.
+ */
+static void vfs_lock_two_pages(struct page *page1, struct page *page2)
+{
+   /* Always lock in order of increasing index. */
+   if (page1->index > page2->index)
+       swap(page1, page2);
+
+   lock_page(page1);
+   if (page1 != page2)
+       lock_page(page2);
+}
+
+/* Unlock two pages, being careful not to unlock the same page twice. */
+static void vfs_unlock_two_pages(struct page *page1, struct page *page2)
+{
+   unlock_page(page1);
+   if (page1 != page2)
+       unlock_page(page2);
+}
+
+/*
  * Compare extents of two files to see if they are the same.
  * Caller must have locked both inodes to prevent write races.
  */
@@ -1948,10 +1967,24 @@ int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
        dest_page = vfs_dedupe_get_page(dest, destoff);
        if (IS_ERR(dest_page)) {
            error = PTR_ERR(dest_page);
-           unlock_page(src_page);
            put_page(src_page);
            goto out_error;
        }
+
+       vfs_lock_two_pages(src_page, dest_page);
+
+       /*
+        * Now that we've locked both pages, make sure they're still
+        * mapped to the file data we're interested in.  If not,
+        * someone is invalidating pages on us and we lose.
+        */
+       if (!PageUptodate(src_page) || !PageUptodate(dest_page) ||
+           src_page->mapping != src->i_mapping ||
+           dest_page->mapping != dest->i_mapping) {
+           same = false;
+           goto unlock;
+       }
+
        src_addr = kmap_atomic(src_page);
        dest_addr = kmap_atomic(dest_page);

@@ -1963,8 +1996,8 @@ int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,

        kunmap_atomic(dest_addr);
        kunmap_atomic(src_addr);
-       unlock_page(dest_page);
-       unlock_page(src_page);
+unlock:
+       vfs_unlock_two_pages(src_page, dest_page);
        put_page(dest_page);
        put_page(src_page);
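
The two helpers in the patch can be modelled outside the kernel as a minimal sketch. Everything here is hypothetical: struct demo_page, the lock flag, and the lock log stand in for struct page and lock_page(), and the flag only simulates a lock, so it provides no real mutual exclusion. The sketch shows the two rules the patch enforces: lock in order of increasing index, and lock or unlock only once when both pointers refer to the same page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: an index and a lock flag. */
struct demo_page {
    unsigned long index;
    int locked;
};

/* Records the index of each page as it is locked, for inspection. */
static unsigned long demo_lock_log[8];
static size_t demo_lock_count;

static void demo_lock_page(struct demo_page *p)
{
    /* A real lock would block here; locking twice is the self-deadlock. */
    assert(!p->locked);
    p->locked = 1;
    if (demo_lock_count < 8)
        demo_lock_log[demo_lock_count++] = p->index;
}

static void demo_unlock_page(struct demo_page *p)
{
    assert(p->locked);
    p->locked = 0;
}

/* Lock both pages, lowest index first; lock once if they are the same. */
static void demo_lock_two_pages(struct demo_page *p1, struct demo_page *p2)
{
    if (p1->index > p2->index) {
        struct demo_page *tmp = p1;

        p1 = p2;
        p2 = tmp;
    }

    demo_lock_page(p1);
    if (p1 != p2)
        demo_lock_page(p2);
}

/* Unlock both pages, taking care not to unlock the same page twice. */
static void demo_unlock_two_pages(struct demo_page *p1, struct demo_page *p2)
{
    demo_unlock_page(p1);
    if (p1 != p2)
        demo_unlock_page(p2);
}
```

Calling demo_lock_two_pages() with pages at index 7 and index 3 records index 3 first, regardless of argument order. Passing the same page for both arguments takes the lock once, so the assertion in demo_lock_page() never fires.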
