This Linux kernel change "mm: add !pte_present() check on existing hugetlb_entry callbacks" is included in the Linux 3.15 release. The change was authored by Naoya Horiguchi <n-horiguchi [at]> on Fri Jun 6 10:00:01 2014 -0400. The commit for this change in the Linux stable tree is d4c5491 (patch).

mm: add !pte_present() check on existing hugetlb_entry callbacks

The page table walker doesn't check for non-present hugetlb entries in
the common path, so hugetlb_entry() callbacks must check for them.  The
reason for this behavior is that some callers want to handle them in
their own way.

[ I think that reason is bogus, btw - it should just do what the regular
  code does, which is to call the "pte_hole()" function for such hugetlb
  entries  - Linus]

However, some callers don't check it now, which causes unpredictable
results, for example when there is a race between hugepage migration and
reading /proc/pid/numa_maps.  This patch fixes it by adding !pte_present
checks to the buggy callbacks.

This bug has existed for years and became visible with the introduction
of hugepage migration.

ChangeLog v2:
- fix if condition (check !pte_present() instead of pte_present())

Reported-by: Sasha Levin <[email protected]>
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: <[email protected]> [3.12+]
Signed-off-by: Andrew Morton <[email protected]>
[ Backported to 3.15.  Signed-off-by: Josh Boyer <[email protected]> ]
Signed-off-by: Linus Torvalds <[email protected]>
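
The distinction the commit message relies on is that an entry can be non-empty without being present, for example while a hugepage is being migrated. The sketch below is a userspace toy model, not kernel code: the `toy_pte_t` type, its bit layout, and every `toy_*` helper are assumptions invented for illustration. It only shows why a `pte_none()`-style test lets such an entry through while a `!pte_present()`-style test rejects it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, NOT the kernel's pte_t. Assumed layout for illustration:
 * bit 0 = present, bit 1 = "migration entry" marker, bits 12+ = frame number. */
typedef struct { uint64_t val; } toy_pte_t;

#define TOY_PTE_PRESENT   0x1ULL
#define TOY_PTE_MIGRATION 0x2ULL

static bool toy_pte_none(toy_pte_t pte)    { return pte.val == 0; }
static bool toy_pte_present(toy_pte_t pte) { return (pte.val & TOY_PTE_PRESENT) != 0; }

static toy_pte_t toy_mk_none(void) { return (toy_pte_t){ 0 }; }

static toy_pte_t toy_mk_present(uint64_t pfn)
{
    return (toy_pte_t){ (pfn << 12) | TOY_PTE_PRESENT };
}

/* Non-empty but not present: the frame-number bits do not name a mapped page. */
static toy_pte_t toy_mk_migration(uint64_t pfn)
{
    return (toy_pte_t){ (pfn << 12) | TOY_PTE_MIGRATION };
}

/* Shape of the old check: only empty entries are skipped. */
static bool old_check_would_touch_page(toy_pte_t pte)
{
    return !toy_pte_none(pte);
}

/* Shape of the fixed check: anything that is not present is skipped. */
static bool new_check_would_touch_page(toy_pte_t pte)
{
    return toy_pte_present(pte);
}

/* Small self-check covering the three entry states. */
static void toy_demo(void)
{
    assert(!old_check_would_touch_page(toy_mk_none()));
    assert(!new_check_would_touch_page(toy_mk_none()));
    assert(old_check_would_touch_page(toy_mk_present(42)));
    assert(new_check_would_touch_page(toy_mk_present(42)));
    /* The case that differs: a migration entry passes the old check only. */
    assert(old_check_would_touch_page(toy_mk_migration(42)));
    assert(!new_check_would_touch_page(toy_mk_migration(42)));
}
```

Calling `toy_demo()` from a `main()` exercises all three states. In the real kernel the encoding of non-present entries is architecture-specific, so this model should be read only as a picture of the three-way split between empty, present, and non-present entries.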

There are 8 lines of Linux source code added/deleted in this change. The code changes to the Linux kernel are as follows.

 fs/proc/task_mmu.c | 2 +-
 mm/mempolicy.c     | 6 +++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 442177b..c4b2646 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1351,7 +1351,7 @@ static int gather_hugetbl_stats(pte_t *pte, unsigned long hmask,
    struct numa_maps *md;
    struct page *page;

-   if (pte_none(*pte))
+   if (!pte_present(*pte))
        return 0;

    page = pte_page(*pte);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 78e1472..30cc47f8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -526,9 +526,13 @@ static void queue_pages_hugetlb_pmd_range(struct vm_area_struct *vma,
    int nid;
    struct page *page;
    spinlock_t *ptl;
+   pte_t entry;

    ptl = huge_pte_lock(hstate_vma(vma), vma->vm_mm, (pte_t *)pmd);
-   page = pte_page(huge_ptep_get((pte_t *)pmd));
+   entry = huge_ptep_get((pte_t *)pmd);
+   if (!pte_present(entry))
+       goto unlock;
+   page = pte_page(entry);
    nid = page_to_nid(page);
    if (node_isset(nid, *nodes) == !!(flags & MPOL_MF_INVERT))
        goto unlock;
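
The mm/mempolicy.c hunk follows a common pattern: take the lock, read the entry once into a local variable, bail out to the unlock label if it is not present, and only then derive anything from it. A minimal userspace sketch of that control flow is given below. It is not kernel code; `struct toy_slot`, the C11 `atomic_flag` spinlock standing in for the page table lock, the bit layout, and the `node_of_pfn` table are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PRESENT 0x1ULL   /* assumed: bit 0 marks a present entry */

/* Hypothetical slot: one entry guarded by a tiny spinlock. */
struct toy_slot {
    atomic_flag lock;
    uint64_t entry;
};

static void toy_slot_init(struct toy_slot *s, uint64_t entry)
{
    atomic_flag_clear(&s->lock);
    s->entry = entry;
}

static void toy_lock(struct toy_slot *s)
{
    while (atomic_flag_test_and_set(&s->lock))
        ;   /* spin until the flag was previously clear */
}

static void toy_unlock(struct toy_slot *s)
{
    atomic_flag_clear(&s->lock);
}

/* Returns the node id for the slot's page, or -1 if the entry is not
 * present (or its frame number is outside the toy table). */
static int toy_lookup_node(struct toy_slot *s, const int *node_of_pfn, size_t n)
{
    int nid = -1;
    uint64_t entry;
    uint64_t pfn;

    toy_lock(s);
    entry = s->entry;             /* snapshot once, under the lock */
    if (!(entry & TOY_PRESENT))
        goto unlock;              /* never interpret a non-present entry */
    pfn = entry >> 12;
    if (pfn < n)
        nid = node_of_pfn[pfn];
unlock:
    toy_unlock(s);
    return nid;
}
```

Reading the entry into a local first means the presence check and the later use see the same value, and the single `unlock` label keeps the lock released on every exit path.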

