Stealing pages from fallback migrate types

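The previous article showed that a page allocation takes a free page from the free list of the requested migrate type at the requested order. That free list can be empty, though, and when no page of the requested migrate type is available the allocator steals pages from a fallback migrate type.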

static __always_inline struct page *
__rmqueue(struct zone *zone, unsigned int order, int migratetype, unsigned int alloc_flags)
{
    struct page *page;
 
retry:
    page = __rmqueue_smallest(zone, order, migratetype);
 
    if (unlikely(!page) && __rmqueue_fallback(zone, order, migratetype,
                                alloc_flags))
        goto retry;
 
    trace_mm_page_alloc_zone_locked(page, order, migratetype);
    return page;
}
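  • __rmqueue_smallest is tried first to take a free page from the requested migrate type.
  • If it returns no page, __rmqueue_fallback is called to steal free pages from a fallback migrate type. It does not return a page itself; when it succeeds, the code jumps back to retry and calls __rmqueue_smallest again.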
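
The retry pattern described above can be sketched outside the kernel. This is a minimal toy model, not kernel code: the toy_zone structure, the two toy migrate types and all toy_* functions are hypothetical names made up for illustration, and the "free lists" are plain counters at a single order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy migrate types; not the kernel's enum. */
enum { TOY_UNMOVABLE, TOY_MOVABLE, TOY_NR_TYPES };

/* Toy zone: number of free pages per migrate type at one order. */
struct toy_zone {
    int nr_free[TOY_NR_TYPES];
};

/* Stand-in for __rmqueue_smallest(): take a page from the requested list. */
static bool toy_rmqueue_smallest(struct toy_zone *z, int mt)
{
    if (z->nr_free[mt] == 0)
        return false;
    z->nr_free[mt]--;
    return true;
}

/*
 * Stand-in for __rmqueue_fallback(): move one free page from another
 * list onto the requested list. It allocates nothing by itself.
 */
static bool toy_rmqueue_fallback(struct toy_zone *z, int mt)
{
    for (int other = 0; other < TOY_NR_TYPES; other++) {
        if (other == mt || z->nr_free[other] == 0)
            continue;
        z->nr_free[other]--;
        z->nr_free[mt]++;
        return true;
    }
    return false;
}

/* Stand-in for __rmqueue(): allocate, and retry after a successful steal. */
static bool toy_rmqueue(struct toy_zone *z, int mt)
{
    assert(mt >= 0 && mt < TOY_NR_TYPES);
    for (;;) {
        if (toy_rmqueue_smallest(z, mt))
            return true;
        if (!toy_rmqueue_fallback(z, mt))
            return false;   /* nothing left to steal */
        /* The requested list was refilled; go around again. */
    }
}
```

With a zone that has two free TOY_MOVABLE pages and no TOY_UNMOVABLE page, two TOY_UNMOVABLE requests succeed through the fallback path and a third one fails.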
static __always_inline bool
__rmqueue_fallback(struct zone *zone, int order, int start_migratetype, unsigned int alloc_flags)
{
    struct free_area *area;
    int current_order;
    int min_order = order;
    struct page *page;
    int fallback_mt;
    bool can_steal;
 
    for (current_order = MAX_ORDER - 1; current_order >= min_order; --current_order) {
        area = &(zone->free_area[current_order]);
        fallback_mt = find_suitable_fallback(area, current_order, start_migratetype, false, &can_steal, order);
        if (fallback_mt == -1)
            continue;
 
        goto do_steal;
    }
    return false;
 
do_steal:
    page = list_first_entry(&area->free_list[fallback_mt], struct page, lru);
    steal_suitable_fallback(zone, page, alloc_flags, start_migratetype, can_steal);
    return true;
}
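  • The loop walks from the largest order (MAX_ORDER - 1) down to min_order and calls find_suitable_fallback at each order to look for a fallback migrate type that has free pages.
  • Once one is found, the first free page on that migrate type's free list is taken.
  • steal_suitable_fallback then does the actual move of the page to the requested migrate type.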
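
The direction of the search matters: the loop starts at the largest order, so the biggest free block available in a fallback list is preferred. A minimal sketch of that scan, assuming MAX_ORDER is 11 (a common default, not stated in this article) and using a hypothetical per-order counter array in place of the real free_area lists:

```c
#include <assert.h>

/* Assumption: 11 orders (0..10), a common MAX_ORDER default. */
#define SCAN_MAX_ORDER 11

/*
 * Return the largest order >= min_order that has at least one free
 * block in the fallback list, or -1 if there is none.
 * nr_free[] is a hypothetical stand-in for the per-order free lists.
 */
static int scan_largest_fallback_order(const unsigned long nr_free[SCAN_MAX_ORDER],
                                       int min_order)
{
    assert(min_order >= 0 && min_order < SCAN_MAX_ORDER);
    for (int order = SCAN_MAX_ORDER - 1; order >= min_order; --order) {
        if (nr_free[order] > 0)
            return order;
    }
    return -1;
}
```

If free blocks exist at orders 3 and 7 and the request is for order 2, the scan picks order 7, not order 3.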

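Which rules decide where pages may be stolen from? The kernel defines a two-dimensional array that lists, for each migrate type, its fallback migrate types in priority order: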

static int fallbacks[MIGRATE_TYPES][4] = {
    [MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,   MIGRATE_TYPES },
    [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,   MIGRATE_TYPES },
    [MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_TYPES },
#ifdef CONFIG_CMA
    [MIGRATE_CMA]         = { MIGRATE_TYPES }, /* Never used */
#endif
#ifdef CONFIG_MEMORY_ISOLATION
    [MIGRATE_ISOLATE]     = { MIGRATE_TYPES }, /* Never used */
#endif
};
  • Fallback priority for MIGRATE_UNMOVABLE: MIGRATE_RECLAIMABLE > MIGRATE_MOVABLE
  • Fallback priority for MIGRATE_RECLAIMABLE: MIGRATE_UNMOVABLE > MIGRATE_MOVABLE
  • Fallback priority for MIGRATE_MOVABLE: MIGRATE_RECLAIMABLE > MIGRATE_UNMOVABLE
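
The priority order listed above amounts to a table walk that stops at a terminator entry. The following is a simplified sketch with hypothetical pf_* names: it copies only the three rows discussed here, and it replaces the real free lists with a boolean per migrate type that says whether the list is non-empty at the current order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's migrate types. */
enum { PF_UNMOVABLE, PF_MOVABLE, PF_RECLAIMABLE, PF_TYPES };

/* Same priorities as the fallbacks[] rows shown above; PF_TYPES ends a row. */
static const int pf_fallbacks[PF_TYPES][3] = {
    [PF_UNMOVABLE]   = { PF_RECLAIMABLE, PF_MOVABLE,   PF_TYPES },
    [PF_RECLAIMABLE] = { PF_UNMOVABLE,   PF_MOVABLE,   PF_TYPES },
    [PF_MOVABLE]     = { PF_RECLAIMABLE, PF_UNMOVABLE, PF_TYPES },
};

/*
 * Return the highest-priority fallback type whose free list is
 * non-empty, or -1 when every candidate is empty.
 */
static int pf_pick_fallback(const bool list_has_pages[PF_TYPES], int mt)
{
    assert(mt >= 0 && mt < PF_TYPES);
    for (int i = 0; pf_fallbacks[mt][i] != PF_TYPES; i++) {
        int candidate = pf_fallbacks[mt][i];

        if (list_has_pages[candidate])
            return candidate;
    }
    return -1;
}
```

For a PF_RECLAIMABLE request, PF_UNMOVABLE wins whenever it has pages; PF_MOVABLE is used only when PF_UNMOVABLE is empty.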
int find_suitable_fallback(struct free_area *area, unsigned int order,
            int migratetype, bool only_stealable, bool *can_steal, unsigned int start_order)
{
    int i;
    int fallback_mt;
 
    if (area->nr_free == 0)
        return -1;
 
    *can_steal = false;
    for (i = 0;; i++) {
        fallback_mt = fallbacks[migratetype][i];
        if (fallback_mt == MIGRATE_TYPES)
            break;
 
        if (list_empty(&area->free_list[fallback_mt]))
            continue;
 
        if (can_steal_fallback(order, migratetype, fallback_mt, start_order))
            *can_steal = true;
 
        if (!only_stealable)
            return fallback_mt;
 
        if (*can_steal)
            return fallback_mt;
    }
 
    return -1;
}

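This function finds a suitable fallback migrate type.

  • It walks the fallbacks entries of the requested migrate type in priority order and stops when it reaches the MIGRATE_TYPES terminator.
  • If the free list of the candidate fallback type is empty at this order, that type has no page to offer, so the loop moves on to the next priority.
  • can_steal_fallback decides whether the allocation may try to take over the whole pageblock; the result is stored in *can_steal and later passed to steal_suitable_fallback as whole_block. With only_stealable set to false, as in the __rmqueue_fallback call above, the first candidate with a non-empty free list is returned regardless of *can_steal.

Once a fallback migrate type has been found, a free page is taken from its free list: list_first_entry(&area->free_list[fallback_mt], struct page, lru);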

static void steal_suitable_fallback(struct zone *zone, struct page *page, unsigned int alloc_flags, int start_type, bool whole_block)
{
    unsigned int current_order = page_order(page);
    struct free_area *area;
    int free_pages, movable_pages, alike_pages;
    int old_block_type;
 
    old_block_type = get_pageblock_migratetype(page);
 
    /*
     * This can happen due to races and we want to prevent broken
     * highatomic accounting.
     */
    if (is_migrate_highatomic(old_block_type))
        goto single_page;
 
    /* Take ownership for orders >= pageblock_order */
    if (current_order >= pageblock_order) {
        change_pageblock_range(page, current_order, start_type);
        goto single_page;
    }
 
    /*
     * Boost watermarks to increase reclaim pressure to reduce the
     * likelihood of future fallbacks. Wake kswapd now as the node
     * may be balanced overall and kswapd will not wake naturally.
     */
    boost_watermark(zone);
    if (alloc_flags & ALLOC_KSWAPD)
        set_bit(ZONE_BOOSTED_WATERMARK, &zone->flags);
 
    /* We are not allowed to try stealing from the whole block */
    if (!whole_block)
        goto single_page;
 
    free_pages = move_freepages_block(zone, page, start_type,
                        &movable_pages);
    /*
     * Determine how many pages are compatible with our allocation.
     * For movable allocation, it's the number of movable pages which
     * we just obtained. For other types it's a bit more tricky.
     */
    if (start_type == MIGRATE_MOVABLE) {
        alike_pages = movable_pages;
    } else {
        /*
         * If we are falling back a RECLAIMABLE or UNMOVABLE allocation
         * to MOVABLE pageblock, consider all non-movable pages as
         * compatible. If it's UNMOVABLE falling back to RECLAIMABLE or
         * vice versa, be conservative since we can't distinguish the
         * exact migratetype of non-movable pages.
         */
        if (old_block_type == MIGRATE_MOVABLE)
            alike_pages = pageblock_nr_pages
                        - (free_pages + movable_pages);
        else
            alike_pages = 0;
    }
 
    /* moving whole block can fail due to zone boundary conditions */
    if (!free_pages)
        goto single_page;
 
    /*
     * If a sufficient number of pages in the block are either free or of
     * comparable migratability as our allocation, claim the whole block.
     */
    if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
            page_group_by_mobility_disabled)
        set_pageblock_migratetype(page, start_type);
 
    return;
 
single_page:
    area = &zone->free_area[current_order];
    list_move(&page->lru, &area->free_list[start_type]);
}

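The point of this function is to move the stolen free pages onto the free list of the requested migrate type.

  • It first decides whether only the single free page should be moved. If so, it jumps to single_page and uses list_move to put the stolen page on the free list of the requested migrate type.
  • If a whole pageblock should be moved, it calls move_freepages_block to move every free page in the block, and it changes the migrate type of the block when at least half of the block is free or compatible with the request.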
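
The "claim the whole block" decision is plain arithmetic on three counters. Here is a worked sketch of it, assuming pageblock_order is 9 (512 pages per block, a common value that this article does not state) and ignoring page_group_by_mobility_disabled. The sk_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: pageblock_order == 9, i.e. 512 pages per pageblock. */
enum { SK_PAGEBLOCK_ORDER = 9, SK_PAGEBLOCK_NR_PAGES = 1 << SK_PAGEBLOCK_ORDER };

/* Pages in the block that are compatible with the request (alike_pages). */
static int sk_alike_pages(bool start_is_movable, bool old_block_is_movable,
                          int free_pages, int movable_pages)
{
    if (start_is_movable)
        return movable_pages;
    if (old_block_is_movable)
        /* Everything that is neither free nor movable counts as compatible. */
        return SK_PAGEBLOCK_NR_PAGES - (free_pages + movable_pages);
    /* UNMOVABLE <-> RECLAIMABLE: be conservative. */
    return 0;
}

/* True when the block's migrate type would be switched to the requester's. */
static bool sk_should_claim_block(bool start_is_movable, bool old_block_is_movable,
                                  int free_pages, int movable_pages)
{
    assert(free_pages >= 0 && movable_pages >= 0);
    assert(free_pages + movable_pages <= SK_PAGEBLOCK_NR_PAGES);

    if (free_pages == 0)    /* moving the block failed */
        return false;

    int alike = sk_alike_pages(start_is_movable, old_block_is_movable,
                               free_pages, movable_pages);
    return free_pages + alike >= (1 << (SK_PAGEBLOCK_ORDER - 1));
}
```

For example, an unmovable request that falls back to a movable block with 100 free and 300 movable pages sees 512 - 400 = 112 compatible pages; 100 + 112 = 212 is below the threshold of 256, so the block keeps its type. With 200 free and 100 movable pages the sum is 200 + 212 = 412 and the block is claimed.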
int move_freepages_block(struct zone *zone, struct page *page, int migratetype, int *num_movable)
{
    unsigned long start_pfn, end_pfn;
    struct page *start_page, *end_page;
 
    start_pfn = page_to_pfn(page);
    start_pfn = start_pfn & ~(pageblock_nr_pages-1);
    start_page = pfn_to_page(start_pfn);
    end_page = start_page + pageblock_nr_pages - 1;
    end_pfn = start_pfn + pageblock_nr_pages - 1;
 
    /* Do not cross zone boundaries */
    if (!zone_spans_pfn(zone, start_pfn))
        start_page = page;
    if (!zone_spans_pfn(zone, end_pfn))
        return 0;
 
    return move_freepages(zone, start_page, end_page, migratetype, num_movable);
}
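  • The pfn of the stolen page is rounded down to a multiple of pageblock_nr_pages, which gives the first page of the pageblock that contains it.
  • end_page and end_pfn point at the last page of that block, so the range covers exactly pageblock_nr_pages pages.
  • If the end of the block lies outside the zone, the function returns 0; otherwise move_freepages is called to move the free pages of the whole block.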
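
The alignment step is a bit mask that works because pageblock_nr_pages is a power of two. A small sketch of the range computation, again assuming 512 pages per pageblock; al_range and al_pageblock_range are hypothetical names:

```c
#include <assert.h>

/* Assumption: pageblock_order == 9, i.e. 512 pages per pageblock. */
#define AL_PAGEBLOCK_ORDER     9UL
#define AL_PAGEBLOCK_NR_PAGES  (1UL << AL_PAGEBLOCK_ORDER)

struct al_range {
    unsigned long start_pfn;  /* first pfn of the pageblock */
    unsigned long end_pfn;    /* last pfn of the pageblock (inclusive) */
};

/* Compute the pageblock that contains pfn. */
static struct al_range al_pageblock_range(unsigned long pfn)
{
    struct al_range r;

    /* The mask trick below is only valid for a power-of-two block size. */
    assert((AL_PAGEBLOCK_NR_PAGES & (AL_PAGEBLOCK_NR_PAGES - 1)) == 0);

    r.start_pfn = pfn & ~(AL_PAGEBLOCK_NR_PAGES - 1);
    r.end_pfn = r.start_pfn + AL_PAGEBLOCK_NR_PAGES - 1;
    return r;
}
```

With these values, pfn 0x12345 belongs to the block that spans pfns 0x12200 through 0x123ff.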
static int move_freepages(struct zone *zone, struct page *start_page, struct page *end_page,  int migratetype, int *num_movable)
{
    struct page *page;
    unsigned int order;
    int pages_moved = 0;
 
    if (num_movable)
        *num_movable = 0;
 
    for (page = start_page; page <= end_page;) {
        if (!pfn_valid_within(page_to_pfn(page))) {
            page++;
            continue;
        }
 
        /* Make sure we are not inadvertently changing nodes */
        VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
 
        if (!PageBuddy(page)) {
            /*
             * We assume that pages that could be isolated for
             * migration are movable. But we don't actually try
             * isolating, as that would be expensive.
             */
            if (num_movable &&
                    (PageLRU(page) || __PageMovable(page)))
                (*num_movable)++;
 
            page++;
            continue;
        }
 
        order = page_order(page);
        list_move(&page->lru,
              &zone->free_area[order].free_list[migratetype]);
        page += 1 << order;
        pages_moved += 1 << order;
    }
 
    return pages_moved;
}

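The for loop does the actual moving:

  1. If the pfn of the page is not valid, the page is skipped.
  2. VM_BUG_ON_PAGE asserts that the page is on the same node as the zone. This is a debug check, not a skip condition.
  3. If the page is not in the buddy system (it is not free), it is skipped; if it is on an LRU list or is __PageMovable, it is counted in *num_movable first.
  4. Otherwise the free block is moved from the fallback free list to the free list of the requested migrate type, and the loop advances by 1 << order pages.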
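When a page is requested, the allocator looks for it on the free lists of the requested migrate type, starting at the requested order. If nothing is found, it goes to the free lists of the fallback migrate types, moves the page it finds there onto the free list of the requested migrate type, and then retries the allocation through __rmqueue_smallest.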

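A simple example: allocating an order-5 page of type MIGRATE_RECLAIMABLE

  1. Look for a page in free_area[5].free_list[MIGRATE_RECLAIMABLE] (and in the higher orders of the same type).
  2. No free page of that type is available.
  3. Read the fallback types from fallbacks[MIGRATE_RECLAIMABLE]: [MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_TYPES },
  4. Suppose MIGRATE_UNMOVABLE has no free page, so the search moves on to MIGRATE_MOVABLE.
  5. Suppose MIGRATE_MOVABLE has a free page.
  6. That page is moved to the MIGRATE_RECLAIMABLE free list of its own order, which is free_area[5].free_list[MIGRATE_RECLAIMABLE] if the page found is an order-5 block.
  7. The allocation is retried and now succeeds.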
