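Basic Concept
block
An ext4 filesystem is organized into block groups, and each block group is divided into multiple blocks. A block is a contiguous region of disk space and is the basic unit in which space is allocated to files. With the default filesystem parameters, each block is 4 KiB. A file created with touch and never written to has no blocks allocated; once data is written, space is allocated to the file in whole blocks.
// Create an empty file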
$ touch /mnt/ext4/a.txt
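// Blocks is the number of 512-byte units allocated to the file; IO Block is the preferred I/O size reported for the file (4096 bytes here). The empty file occupies 0 units.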
$ stat /mnt/ext4/a.txt
File: /mnt/ext4/a.txt
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 840h/2112d Inode: 12 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2022-10-03 01:54:44.733048293 +0000
Modify: 2022-10-03 01:54:44.733048293 +0000
Change: 2022-10-03 01:54:44.733048293 +0000
Birth: 2022-10-03 01:54:44.733048293 +0000
// Write a few bytes to the file
$ echo -n "hello" > /mnt/ext4/a.txt
// ext4 has now allocated 8 units of 512 bytes each: 8 x 512 = 4096 bytes, i.e. one 4 KiB filesystem block, which here equals the IO Block size
$ stat /mnt/ext4/a.txt
File: /mnt/ext4/a.txt
Size: 5 Blocks: 8 IO Block: 4096 regular file
Device: 840h/2112d Inode: 12 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2022-10-03 01:54:44.733048293 +0000
Modify: 2022-10-03 01:55:13.578830808 +0000
Change: 2022-10-03 01:55:13.578830808 +0000
Birth: 2022-10-03 01:54:44.733048293 +0000
// Inspect the file's block mapping. Because a.txt holds only a few bytes, a single extent covering one data block is enough to store it.
$ filefrag -v /mnt/ext4/a.txt
Filesystem type is: ef53
File size of /mnt/ext4/a.txt is 5 (1 block of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 0: 1015808.. 1015808: 1: last,eof
/mnt/ext4/a.txt: 1 extent found
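The relationship between the Blocks value from stat and filesystem blocks can be checked with a little arithmetic. The following is a minimal sketch, not part of the original session; the helper names allocated_bytes and blocks_needed are made up for illustration, and it assumes stat counts in 512-byte units, as it does on Linux.

```shell
#!/bin/sh
# stat's "Blocks" field counts 512-byte units, not filesystem blocks.
UNIT_BYTES=512

# Convert a stat "Blocks" value into bytes of allocated space.
allocated_bytes() {
    echo $(( $1 * UNIT_BYTES ))
}

# Number of filesystem blocks needed for a file of a given size.
# Args: file size in bytes, block size in bytes. Rounds up.
blocks_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

allocated_bytes 8       # the "Blocks: 8" reported for a.txt
blocks_needed 5 4096    # the 5-byte a.txt on a 4 KiB-block filesystem
blocks_needed 0 4096    # an empty file
```

Run as written, this prints 4096, 1 and 0, matching the two stat outputs above: the 5-byte file occupies one whole 4 KiB block, while the empty file occupies none.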
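block group
A block group is a set of contiguous blocks. By default a block group holds 8 x block_size blocks, which is the number of blocks that a single block bitmap block can track, so with 4 KiB blocks each group contains 32768 blocks (128 MiB).
$ dumpe2fs /dev/sde |egrep 'Blocks per group|Block size'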
dumpe2fs 1.46.5 (30-Dec-2021)
Block size: 4096
// Blocks per block group: 32768 (32K) blocks of 4 KiB each, so each block group spans 128 MiB
Blocks per group: 32768
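The figures reported by dumpe2fs can be reproduced by hand. This is a rough sketch under the assumption that the group size is limited by a one-block bitmap (8 bits per byte of block size); the total of 1048576 blocks is taken from the mkfs.ext4 output shown later in this article, and the variable names are illustrative only.

```shell
#!/bin/sh
block_size=4096        # "Block size" from dumpe2fs
total_blocks=1048576   # "Creating filesystem with 1048576 4k blocks"

# One bitmap block has block_size * 8 bits, one bit per block in the group.
blocks_per_group=$(( 8 * block_size ))
group_bytes=$(( blocks_per_group * block_size ))
# Round up so that a trailing partial group is counted as well.
group_count=$(( (total_blocks + blocks_per_group - 1) / blocks_per_group ))

echo "blocks per group: $blocks_per_group"
echo "group size (MiB): $(( group_bytes / 1024 / 1024 ))"
echo "block groups: $group_count"
```

For this 4 GiB device the sketch gives 32768 blocks per group, 128 MiB per group, and 32 block groups (numbered 0 to 31).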
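inode
An inode uniquely identifies a file within the filesystem and maps the file's logical blocks to physical blocks on disk. An inode typically stores the file's access, modify, change and creation timestamps, its permissions and, most importantly, which blocks belong to the file. Note that when a file is deleted on ext4, its inode can be reclaimed and reused by a new file: in the session below, b.ext receives inode 12, which previously belonged to a.txt.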
$ stat /mnt/ext4/a.txt
File: /mnt/ext4/a.txt
Size: 5 Blocks: 8 IO Block: 4096 regular file
Device: 840h/2112d Inode: 12 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2022-10-03 01:54:44.733048293 +0000
Modify: 2022-10-03 01:55:13.578830808 +0000
Change: 2022-10-03 01:55:13.578830808 +0000
Birth: 2022-10-03 01:54:44.733048293 +0000
$ rm -rf /mnt/ext4/a.txt
$ touch /mnt/ext4/b.ext
$ stat /mnt/ext4/b.ext
File: /mnt/ext4/b.ext
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 840h/2112d Inode: 12 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2022-10-03 02:23:49.608093750 +0000
Modify: 2022-10-03 02:23:49.608093750 +0000
Change: 2022-10-03 02:23:49.608093750 +0000
Birth: 2022-10-03 02:23:49.608093750 +0000
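superblock
The superblock stores the metadata that describes the ext4 filesystem as a whole. If the superblock is damaged, the filesystem cannot be accessed, so several backup copies are kept. ext4 uses the sparse_super feature, which stores backups only in selected block groups rather than in every group.
$ mkfs.ext4 /dev/sde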
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done
Creating filesystem with 1048576 4k blocks and 262144 inodes
Filesystem UUID: a670d8f3-a153-43d3-a8c7-e4d67ecb2c63
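// Superblock backups: with 32768 blocks per group, the backups sit at the first block of groups 1, 3, 5, 7, 9, 25 and 27 (group 1 plus powers of 3, 5 and 7), i.e. at 32768 x 1, x 3, x 5, x 7, x 9, x 25 and x 27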
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
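The sparse_super layout of the ext4 filesystem is shown below. INODE_UNINIT means that the group's inode bitmap and inode table have not been initialized; BLOCK_UNINIT means that its block bitmap has not been initialized.
$ dumpe2fs /dev/sdf |grep superblock -B1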
dumpe2fs 1.46.5 (30-Dec-2021)
Group 0: (Blocks 0-32767) csum 0xceed [ITABLE_ZEROED]
Primary superblock at 0, Group descriptors at 1-1
--
Group 1: (Blocks 32768-65535) csum 0x35e5 [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 32768, Group descriptors at 32769-32769
--
Group 3: (Blocks 98304-131071) csum 0x7ae0 [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 98304, Group descriptors at 98305-98305
--
Group 5: (Blocks 163840-196607) csum 0xa0ab [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 163840, Group descriptors at 163841-163841
--
Group 7: (Blocks 229376-262143) csum 0x921b [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 229376, Group descriptors at 229377-229377
--
Group 9: (Blocks 294912-327679) csum 0x6988 [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 294912, Group descriptors at 294913-294913
--
Group 25: (Blocks 819200-851967) csum 0x220e [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 819200, Group descriptors at 819201-819201
--
Group 27: (Blocks 884736-917503) csum 0x68a9 [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Backup superblock at 884736, Group descriptors at 884737-884737
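The list of backup locations follows from the sparse_super rule: backups live in group 1 and in groups whose number is a power of 3, 5 or 7. The sketch below recomputes the block numbers; the function names are invented for this example, and it assumes 4 KiB blocks, so that group g starts at block g x blocks_per_group.

```shell
#!/bin/sh
# Succeed if n is 1 or an exact power of base.
is_power_of() {
    n=$1
    base=$2
    [ "$n" -ge 1 ] || return 1
    while [ "$n" -gt 1 ]; do
        [ $(( n % base )) -eq 0 ] || return 1
        n=$(( n / base ))
    done
    return 0
}

# Succeed if block group $1 carries a backup superblock under sparse_super.
has_backup_superblock() {
    grp=$1
    [ "$grp" -ge 1 ] || return 1   # group 0 holds the primary copy
    is_power_of "$grp" 3 || is_power_of "$grp" 5 || is_power_of "$grp" 7
}

# Print the first block of every group that holds a backup.
# Args: number of block groups, blocks per group.
backup_blocks() {
    out=""
    g=1
    while [ "$g" -lt "$1" ]; do
        if has_backup_superblock "$g"; then
            out="$out $(( g * $2 ))"
        fi
        g=$(( g + 1 ))
    done
    echo $out
}

backup_blocks 32 32768
```

With 32 groups of 32768 blocks this prints 32768 98304 163840 229376 294912 819200 884736, the same list that mkfs.ext4 and dumpe2fs report above.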
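file on disk
extent
To read and write large files efficiently, ext4 uses extents organized in a B-tree-like structure. An extent describes a run of contiguous blocks with a single record, which reduces the amount of mapping metadata needed for large files and speeds up looking up and updating data blocks.
/********************ext4 filesystem**************************/
// A 180M file named file1 has been created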
[root@ubuntu /mnt/ext4]$ ls -lih
total 180M
12 -rw-r--r-- 1 root root 0 Oct 3 02:23 b.ext
13 -rw-r--r-- 1 root root 180M Oct 3 03:02 file1
11 drwx------ 2 root root 16K Oct 3 01:47 lost+found
// Inspect how file1 is mapped on ext4: the 180M file is covered by just 2 extents
[root@ubuntu /mnt/ext4]$ debugfs -R 'stat <13>' /dev/sde
debugfs 1.46.5 (30-Dec-2021)
Inode: 13 Type: regular Mode: 0644 Flags: 0x80000
Generation: 3968155522 Version: 0x00000000:00000001
User: 0 Group: 0 Project: 0 Size: 188477052
File ACL: 0
Links: 1 Blockcount: 368120
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x633a50bc:8166c4c0 -- Mon Oct 3 03:02:20 2022
atime: 0x633a50bc:056bd75c -- Mon Oct 3 03:02:20 2022
mtime: 0x633a50bc:8166c4c0 -- Mon Oct 3 03:02:20 2022
crtime: 0x633a50bc:056bd75c -- Mon Oct 3 03:02:20 2022
Size of extra inode fields: 32
Inode checksum: 0xad1ad685
EXTENTS:
(0-32767):1015808-1048575, (32768-46014):983040-996286
/********************ext3 filesystem**************************/
[root@ubuntu /mnt/ext3]$ ls -lih
total 180M
12 -rw-r--r-- 1 root root 180M Oct 3 03:02 file1
11 drwx------ 2 root root 16K Oct 3 03:01 lost+found
[root@ubuntu /mnt/ext3]$ debugfs -R 'stat <12>' /dev/sdf
debugfs 1.46.5 (30-Dec-2021)
Inode: 12 Type: regular Mode: 0644 Flags: 0x0
Generation: 2196076409 Version: 0x00000000:00000001
User: 0 Group: 0 Project: 0 Size: 188477052
File ACL: 0
Links: 1 Blockcount: 368488
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x633a50bf:7d99e208 -- Mon Oct 3 03:02:23 2022
atime: 0x633a50be:ec398a54 -- Mon Oct 3 03:02:22 2022
mtime: 0x633a50bf:7d99e208 -- Mon Oct 3 03:02:23 2022
crtime: 0x633a50be:ec398a54 -- Mon Oct 3 03:02:22 2022
Size of extra inode fields: 32
BLOCKS:
(0-11):17408-17419, (IND):17178, (12-15):17420-17423, (16-63):17200-17247, (64-127):17280-17343, (128-255):17920-18047, (256-1023):18176-18943, (1024-1035):19456-19467, (DIND):17179, (IND):17180, (1036-2059):19468-20491, (IND):17181, (2060-3083):20492-21515, (IND):17182, (3084-4107):21516-22539, (IND):17183, (4108-5131):22540-23563, (IND):17344, (5132-6155):23564-24587, (IND):17345, (6156-7179):24588-25611, (IND):17346, (7180-8203):25612-26635, (IND):17347, (8204-9227):26636-27659, (IND):17348, (9228-10251):27660-28683, (IND):17349, (10252-11275):28684-29707, (IND):17350, (11276-12299):29708-30731, (IND):17351, (12300-13323):30732-31755, (IND):17352, (13324-14335):31756-32767, (14336-14347):34816-34827, (IND):17353, (14348-15371):34828-35851, (IND):17354, (15372-16395):35852-36875, (IND):17355, (16396-17419):36876-37899, (IND):17356, (17420-18443):37900-38923, (IND):17357, (18444-19467):38924-39947, (IND):17358, (19468-20491):39948-40971, (IND):17359, (20492-21515):40972-41995, (IND):17360, (21516-22539):41996-43019, (IND):17361, (22540-23563):43020-44043, (IND):17362, (23564-24587):44044-45067, (IND):17363, (24588-25611):45068-46091, (IND):17364, (25612-26635):46092-47115, (IND):17365, (26636-27659):47116-48139, (IND):17366, (27660-28683):48140-49163, (IND):17367, (28684-29707):49164-50187, (IND):17368, (29708-30731):50188-51211, (IND):17369, (30732-31755):51212-52235, (IND):17370, (31756-32779):52236-53259, (IND):17371, (32780-33803):53260-54283, (IND):17372, (33804-34827):54284-55307, (IND):17373, (34828-35851):55308-56331, (IND):17374, (35852-36875):56332-57355, (IND):17375, (36876-37899):57356-58379, (IND):17376, (37900-38923):58380-59403, (IND):17377, (38924-39947):59404-60427, (IND):17378, (39948-40971):60428-61451, (IND):17379, (40972-41995):61452-62475, (IND):17380, (41996-43019):62476-63499, (IND):17381, (43020-44043):63500-64523, (IND):17382, (44044-45055):64524-65535, (45056-45067):67584-67595, (IND):17383, (45068-46014):67596-68542
TOTAL: 46061
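On ext4, file1 needs only 2 extents. For a 180M file this greatly reduces the amount of mapping metadata, in sharp contrast to ext3, where the same file needs dozens of indirect blocks (the IND and DIND entries in the BLOCKS listing above).
[root@ubuntu /mnt/ext4]$ ls -l -ihl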
total 180M
12 -rw-r--r-- 1 root root 0 Oct 3 02:23 b.ext
13 -rw-r--r-- 1 root root 180M Oct 3 03:02 file1
11 drwx------ 2 root root 16K Oct 3 01:47 lost+found
[root@ubuntu /mnt/ext4]$ debugfs -R 'extents file1' /dev/sde
debugfs 1.46.5 (30-Dec-2021)
Level Entries Logical Physical Length Flags
0/ 0 1/ 2 0 - 32767 1015808 - 1048575 32768
0/ 0 2/ 2 32768 - 46014 983040 - 996286 13247
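The metadata cost on ext3 can be estimated from the file size alone. The sketch below counts the indirect blocks of the classic block-map scheme, assuming 12 direct pointers in the inode and 4-byte block pointers, and covering only the single- and double-indirect levels (enough for this file). The function name indirect_blocks is made up for illustration.

```shell
#!/bin/sh
# Estimate the number of indirect (mapping) blocks an ext3-style file needs.
# Args: number of data blocks, block size in bytes.
# Triple-indirect blocks are not handled.
indirect_blocks() {
    data=$1
    ptrs=$(( $2 / 4 ))        # 4-byte block pointers per indirect block
    rest=$(( data - 12 ))     # the first 12 blocks are mapped directly from the inode
    if [ "$rest" -le 0 ]; then
        echo 0
        return
    fi
    count=1                   # one single-indirect block
    rest=$(( rest - ptrs ))
    if [ "$rest" -gt 0 ]; then
        # one double-indirect block, plus one indirect block per ptrs data blocks
        count=$(( count + 1 + (rest + ptrs - 1) / ptrs ))
    fi
    echo "$count"
}

indirect_blocks 46015 4096   # file1: logical blocks 0-46014
```

This prints 46, which agrees with the ext3 listing: TOTAL 46061 minus 46015 data blocks leaves 46 mapping blocks. On ext4 the same file is described by 2 extent records, which fit inside the inode itself.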
flex_bg
A flexible block group (flex_bg) is a set of consecutive block groups. The first block group in each flex_bg stores the block bitmaps, inode bitmaps and inode tables of all the block groups in that flex_bg.
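Which flex_bg a block group belongs to is a simple integer division. The sketch below assumes 16 block groups per flex_bg, which is the usual mke2fs default but was not shown in the session above; the actual value is reported as "Flex block group size" by dumpe2fs -h. The helper names are hypothetical.

```shell
#!/bin/sh
# Assumed flex_bg size; check "Flex block group size" in dumpe2fs -h output.
GROUPS_PER_FLEX=16

# Index of the flex_bg that contains block group $1.
flex_index() {
    echo $(( $1 / GROUPS_PER_FLEX ))
}

# First block group of that flex_bg, where its bitmaps and inode tables are packed.
flex_first_group() {
    echo $(( ($1 / GROUPS_PER_FLEX) * GROUPS_PER_FLEX ))
}

flex_index 27
flex_first_group 27
```

Under this assumption, block group 27 belongs to flex_bg 1, whose metadata is stored in block group 16, so the 32 groups of this filesystem would form two flex_bgs.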