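Page migration is an important memory-management technique in the Linux kernel. It originated on NUMA systems (Page migration [LWN.net]); later, memory compaction, CMA, and COW also came to depend on it, and it has gradually become a core part of the kernel's memory subsystem.
On a NUMA system, a process should allocate memory from the local node of the CPU it runs on (local memory) whenever possible, because that gives the best memory access performance. The scheduler, however, may move a process from one CPU to another when there are too many runnable processes. After such a move, memory allocated on the old CPU's node becomes remote memory, and every access crosses nodes, which is slower: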

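The figure above shows an example NUMA system:
Suppose thread P0 initially runs on the CPU0 node and is later moved to CPU1 by the scheduler. To restore performance, its pages are migrated.
In general, when scheduling moves a running process to another node on a NUMA system, the cost of remote accesses shows up as a performance dip, and performance recovers once the pages have been migrated. For this reason, processes on a NUMA system should be kept from switching between nodes as far as possible.
Memory compaction is the second part of the anti-fragmentation work developed by Mel Gorman: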

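After running for a long time, a system develops fairly severe memory fragmentation. The kernel uses several techniques to reduce fragmentation, and memory compaction is one of the important ones. Compaction uses page migration to gather scattered, non-contiguous in-use physical pages together, freeing up contiguous physical memory for large allocations.
The kernel also provides a way to trigger memory compaction manually:
echo 1 > /proc/sys/vm/compact_memory
CMA addresses the memory wasted when contiguous physical memory for DMA has to be reserved in advance. A dedicated region is set aside for CMA: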

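When memory inside a CMA area has been taken by movable-type allocations and dma_alloc_contiguous is then called to obtain contiguous physical memory, the allocator finds the CMA area occupied and starts page migration to move the movable pages out of the area, freeing contiguous physical memory.
These are the three most common page migration scenarios. Others, such as COW and transparent huge pages, use it as well.
Page migration is not only triggered automatically by the kernel. A system call is also provided so that user space can request it when needed: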
#include <numaif.h>
long migrate_pages(int pid, unsigned long maxnode,
                   const unsigned long *old_nodes,
                   const unsigned long *new_nodes);
This system call attempts to move all physical pages of process pid that reside on the nodes in old_nodes to the nodes in new_nodes.
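Parameters:
pid: ID of the target process; 0 means the calling process.
maxnode: the maximum number of bits in each of the two node masks.
old_nodes: bit mask of the nodes to move pages away from.
new_nodes: bit mask of the nodes to move pages to.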
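To make the calling convention concrete, the following is a minimal sketch that invokes the system call through syscall(2), so it does not need libnuma at build time. The helper names node_mask and move_pages_between_nodes are made up for this example, and it assumes that all nodes of interest fit into a single unsigned long mask. Passing the mask width plus one as maxnode follows the convention libnuma is understood to use; treat that detail as an assumption as well.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: every node of interest fits in one unsigned long mask. */
#define MASK_BITS (8 * sizeof(unsigned long))

/* Return a node mask with only the bit for `node` set. */
static unsigned long node_mask(unsigned int node)
{
    assert(node < MASK_BITS);
    return 1UL << node;
}

/*
 * Hypothetical helper: ask the kernel to move the pages of `pid` that
 * currently sit on `from_node` over to `to_node`.
 * Returns the raw syscall result: the number of pages that could not
 * be moved, or -1 with errno set.
 */
long move_pages_between_nodes(int pid, unsigned int from_node,
                              unsigned int to_node)
{
    unsigned long old_nodes = node_mask(from_node);
    unsigned long new_nodes = node_mask(to_node);

    /* maxnode is passed as mask width + 1 (assumed libnuma convention). */
    return syscall(SYS_migrate_pages, pid, (unsigned long)MASK_BITS + 1,
                   &old_nodes, &new_nodes);
}
```

A caller could use move_pages_between_nodes(0, 0, 1) to request that the calling process's pages on node 0 be moved to node 1. With libnuma installed, the migrate_pages wrapper declared in <numaif.h> can be called directly instead (link with -lnuma).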
The overall structure of page migration is roughly as follows:

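The migrate_pages system call traps into the kernel and calls kernel_migrate_pages, which eventually calls the kernel function migrate_pages to carry out the migration.
For a huge page, the kernel migrate_pages function calls unmap_and_move_huge_page, which allocates a new huge page, removes the PTE mappings of the old huge page in every process that maps it, copies the contents of the old huge page into the new one, and finally re-establishes the mappings.
For a normal page it calls unmap_and_move, which works the same way: it allocates a new page, removes the mappings of the old page in every process, copies the page contents, and moves the page table entries over to the new page.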
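The sequence described above can be illustrated with a small user-space toy model. This is not kernel code: struct toy_page, struct toy_pte, and toy_unmap_and_move are invented for the sketch, and a "PTE" here is just a pointer plus a flag that stands in for a migration entry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 64   /* toy size, not a real page size */

struct toy_page {
    unsigned char data[TOY_PAGE_SIZE];
};

/* One toy "PTE" per mapping: a page pointer or a migration marker. */
struct toy_pte {
    struct toy_page *page;  /* NULL while the entry is a migration marker */
    int migrating;          /* 1 = an accessor would have to wait */
};

/*
 * Toy version of the steps in the text:
 *   1. allocate the new page,
 *   2. replace every mapping of the old page with a migration marker,
 *   3. copy the contents,
 *   4. point the marked mappings at the new page.
 * Returns the new page, or NULL if allocation failed (nothing changed).
 */
struct toy_page *toy_unmap_and_move(struct toy_page *old_page,
                                    struct toy_pte *ptes, size_t nptes)
{
    struct toy_page *new_page = malloc(sizeof(*new_page));
    size_t i;

    if (!new_page)
        return NULL;

    /* Step 2: "unmap" by installing migration markers. */
    for (i = 0; i < nptes; i++) {
        if (ptes[i].page == old_page) {
            ptes[i].page = NULL;
            ptes[i].migrating = 1;
        }
    }

    /* Step 3: copy while no mapping can reach the old page. */
    memcpy(new_page->data, old_page->data, TOY_PAGE_SIZE);

    /* Step 4: turn the markers into mappings of the new page. */
    for (i = 0; i < nptes; i++) {
        if (ptes[i].migrating) {
            ptes[i].page = new_page;
            ptes[i].migrating = 0;
        }
    }
    return new_page;
}
```

The point of the marker step is that nothing can modify the old page while it is being copied; in the kernel, a task that faults on a migration entry waits until the migration finishes.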
The main situations that trigger page migration are summarized below:

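The kernel function migrate_pages is the entry point for carrying out page migration. It is defined as follows:
int migrate_pages(struct list_head *from, new_page_t get_new_page,
                  free_page_t put_new_page, unsigned long private,
                  enum migrate_mode mode, int reason)
Parameters:
migrate_mode, the migration mode, mainly takes the following values:
reason describes why the migration is being performed: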
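For reference, the two enumerations can be mirrored in a small stand-alone sketch. The names and comments below follow the 5.x-era kernel headers as far as is known here; the exact set of values differs between kernel versions, so the listing is an approximation. The helper migrate_mode_may_block is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-alone mirror of the kernel's enum migrate_mode (assumed layout). */
enum migrate_mode {
    MIGRATE_ASYNC,        /* never block */
    MIGRATE_SYNC_LIGHT,   /* may block on most operations, not on writeback */
    MIGRATE_SYNC,         /* may block, including waiting for writeback */
    MIGRATE_SYNC_NO_COPY, /* like MIGRATE_SYNC, but the CPU does not copy */
};

/* Stand-alone mirror of the kernel's enum migrate_reason (assumed set). */
enum migrate_reason {
    MR_COMPACTION,        /* memory compaction */
    MR_MEMORY_FAILURE,    /* handling of failed (poisoned) memory */
    MR_MEMORY_HOTPLUG,    /* memory offlining */
    MR_SYSCALL,           /* migrate_pages/move_pages system calls */
    MR_MEMPOLICY_MBIND,   /* mbind() memory policy */
    MR_NUMA_MISPLACED,    /* automatic NUMA balancing */
    MR_CONTIG_RANGE,      /* contiguous range allocation, used by CMA */
};

/* Hypothetical helper: may a migration in this mode sleep at all? */
bool migrate_mode_may_block(enum migrate_mode mode)
{
    return mode != MIGRATE_ASYNC;
}
```

The three scenarios discussed earlier correspond to MR_NUMA_MISPLACED, MR_COMPACTION, and MR_CONTIG_RANGE, while the user-space system call corresponds to MR_SYSCALL.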
The kernel's migrate_pages processing is relatively complex. The kernel documentation (Page migration — The Linux Kernel documentation) describes the migration steps:

These steps are spread throughout the code of migrate_pages (in mm/migrate.c):
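One part of migrate_pages that is easy to overlook is its retry loop. The following user-space toy sketches that loop under a few assumptions: the limit of 10 passes and the rule that force is set once more than two passes have gone by reflect the 5.x-era source as understood here, and toy_migrate_pages together with the two callbacks are invented for the example. The kernel's special handling of -ENOMEM, which aborts the whole loop, is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PASSES 10   /* assumed pass limit of the kernel loop */

/* Hypothetical per-page callback: 0 on success, -EAGAIN to ask for a
 * retry, any other negative errno for a permanent failure. */
typedef int (*move_one_fn)(int page_id, int force);

/*
 * Toy retry loop. `done[i]` marks pages that need no further attempts.
 * Returns the number of pages that were not migrated.
 */
int toy_migrate_pages(const int *pages, int *done, size_t n,
                      move_one_fn move_one)
{
    int nr_failed = 0;
    int retry = 1;
    int pass;
    size_t i;

    for (pass = 0; pass < MAX_PASSES && retry; pass++) {
        retry = 0;
        for (i = 0; i < n; i++) {
            int rc;

            if (done[i])
                continue;
            /* After a few passes, allow the callee to block. */
            rc = move_one(pages[i], pass > 2);
            if (rc == -EAGAIN) {
                retry++;
            } else {
                done[i] = 1;
                if (rc != 0)
                    nr_failed++;
            }
        }
    }
    /* Pages still asking for a retry count as failed. */
    return nr_failed + retry;
}

/* Example callback: odd pages are "busy" until force is set. */
int busy_until_forced(int page_id, int force)
{
    return (page_id % 2 == 0 || force) ? 0 : -EAGAIN;
}

/* Example callback: the page can never be migrated right now. */
int always_busy(int page_id, int force)
{
    (void)page_id;
    (void)force;
    return -EAGAIN;
}
```

With busy_until_forced, the odd pages fail with -EAGAIN on the first three passes and succeed on the fourth, once force becomes nonzero. This is the role the force argument plays in __unmap_and_move.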

__unmap_and_move is the function that actually performs the migration of a page. Its overall flow is as follows:

Some of the key steps:
The code of __unmap_and_move is as follows:
static int __unmap_and_move(struct page *page, struct page *newpage,
                            int force, enum migrate_mode mode)
{
    int rc = -EAGAIN;
    int page_was_mapped = 0;
    struct anon_vma *anon_vma = NULL;
    bool is_lru = !__PageMovable(page);

    if (!trylock_page(page)) {
        if (!force || mode == MIGRATE_ASYNC)
            goto out;

        /*
         * It's not safe for direct compaction to call lock_page.
         * For example, during page readahead pages are added locked
         * to the LRU. Later, when the IO completes the pages are
         * marked uptodate and unlocked. However, the queueing
         * could be merging multiple pages for one bio (e.g.
         * mpage_readahead). If an allocation happens for the
         * second or third page, the process can end up locking
         * the same page twice and deadlocking. Rather than
         * trying to be clever about what pages can be locked,
         * avoid the use of lock_page for direct compaction
         * altogether.
         */
        if (current->flags & PF_MEMALLOC)
            goto out;

        lock_page(page);
    }

    if (PageWriteback(page)) {
        /*
         * Only in the case of a full synchronous migration is it
         * necessary to wait for PageWriteback. In the async case,
         * the retry loop is too short and in the sync-light case,
         * the overhead of stalling is too much
         */
        switch (mode) {
        case MIGRATE_SYNC:
        case MIGRATE_SYNC_NO_COPY:
            break;
        default:
            rc = -EBUSY;
            goto out_unlock;
        }
        if (!force)
            goto out_unlock;
        wait_on_page_writeback(page);
    }

    /*
     * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
     * we cannot notice that anon_vma is freed while we migrates a page.
     * This get_anon_vma() delays freeing anon_vma pointer until the end
     * of migration. File cache pages are no problem because of page_lock()
     * File Caches may use write_page() or lock_page() in migration, then,
     * just care Anon page here.
     *
     * Only page_get_anon_vma() understands the subtleties of
     * getting a hold on an anon_vma from outside one of its mms.
     * But if we cannot get anon_vma, then we won't need it anyway,
     * because that implies that the anon page is no longer mapped
     * (and cannot be remapped so long as we hold the page lock).
     */
    if (PageAnon(page) && !PageKsm(page))
        anon_vma = page_get_anon_vma(page);

    /*
     * Block others from accessing the new page when we get around to
     * establishing additional references. We are usually the only one
     * holding a reference to newpage at this point. We used to have a BUG
     * here if trylock_page(newpage) fails, but would like to allow for
     * cases where there might be a race with the previous use of newpage.
     * This is much like races on refcount of oldpage: just don't BUG().
     */
    if (unlikely(!trylock_page(newpage)))
        goto out_unlock;

    if (unlikely(!is_lru)) {
        rc = move_to_new_page(newpage, page, mode);
        goto out_unlock_both;
    }

    /*
     * Corner case handling:
     * 1. When a new swap-cache page is read into, it is added to the LRU
     * and treated as swapcache but it has no rmap yet.
     * Calling try_to_unmap() against a page->mapping==NULL page will
     * trigger a BUG. So handle it here.
     * 2. An orphaned page (see truncate_complete_page) might have
     * fs-private metadata. The page can be picked up due to memory
     * offlining. Everywhere else except page reclaim, the page is
     * invisible to the vm, so the page can not be migrated. So try to
     * free the metadata, so the page can be freed.
     */
    if (!page->mapping) {
        VM_BUG_ON_PAGE(PageAnon(page), page);
        if (page_has_private(page)) {
            try_to_free_buffers(page);
            goto out_unlock_both;
        }
    } else if (page_mapped(page)) {
        /* Establish migration ptes */
        VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma,
                       page);
        try_to_unmap(page,
                     TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
        page_was_mapped = 1;
    }

    if (!page_mapped(page))
        rc = move_to_new_page(newpage, page, mode);

    if (page_was_mapped)
        remove_migration_ptes(page,
            rc == MIGRATEPAGE_SUCCESS ? newpage : page, false);

out_unlock_both:
    unlock_page(newpage);
out_unlock:
    /* Drop an anon_vma reference if we took one */
    if (anon_vma)
        put_anon_vma(anon_vma);
    unlock_page(page);
out:
    /*
     * If migration is successful, decrease refcount of the newpage
     * which will not free the page because new page owner increased
     * refcounter. As well, if it is LRU page, add the page to LRU
     * list in here. Use the old state of the isolated source page to
     * determine if we migrated a LRU page. newpage was already unlocked
     * and possibly modified by its owner - don't rely on the page
     * state.
     */
    if (rc == MIGRATEPAGE_SUCCESS) {
        if (unlikely(!is_lru))
            put_page(newpage);
        else
            putback_lru_page(newpage);
    }

    return rc;
}
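The first two checks in the function above decide whether the caller is allowed to sleep, and they are easier to see in isolation. The sketch below restates just those two decisions as pure functions that can be compiled in user space. The names on_lock_contended and on_writeback are invented here, and the enum is a local copy rather than the kernel header. A return value of 0 means the kernel code would block and continue; a negative value is the rc it would bail out with.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Local stand-in for the kernel's enum migrate_mode. */
enum migrate_mode {
    MIGRATE_ASYNC,
    MIGRATE_SYNC_LIGHT,
    MIGRATE_SYNC,
    MIGRATE_SYNC_NO_COPY,
};

/*
 * trylock_page(page) failed. `pf_memalloc` stands for
 * (current->flags & PF_MEMALLOC), i.e. the direct compaction case.
 * Returns 0 if lock_page() would be called, else -EAGAIN.
 */
int on_lock_contended(enum migrate_mode mode, bool force, bool pf_memalloc)
{
    if (!force || mode == MIGRATE_ASYNC)
        return -EAGAIN;
    if (pf_memalloc)
        return -EAGAIN;
    return 0;
}

/*
 * The page is under writeback.
 * Returns 0 if wait_on_page_writeback() would be called,
 * -EBUSY for the async and sync-light modes, and -EAGAIN when a
 * fully synchronous mode is used without force.
 */
int on_writeback(enum migrate_mode mode, bool force)
{
    switch (mode) {
    case MIGRATE_SYNC:
    case MIGRATE_SYNC_NO_COPY:
        break;
    default:
        return -EBUSY;
    }
    if (!force)
        return -EAGAIN;
    return 0;
}
```

Read this way, MIGRATE_ASYNC never sleeps on the page lock, MIGRATE_SYNC_LIGHT may sleep on the lock but never on writeback, and only the fully synchronous modes wait for writeback, and then only when force is set.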