Ten Examples to Help You Master strace

Author: 用户3147702
Published: 2022-06-27 16:54:42
Column: 小脑斧科技博客

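1. Introduction

Earlier articles covered practical tcpdump techniques:

Troubleshooting Computer Network Problems (1) -- tcpdump Principles and Basic Parameters

Hands-On Troubleshooting of Computer Network Problems (2) -- tcpdump Filter Expressions

tcpdump is a powerful tool for network troubleshooting. With the techniques from those articles, you can inspect what a networked application is doing at any time.

Beyond knowing what network traffic an application generates, is there a way to find out what a running process is actually doing?

There is. The Linux command strace traces the system calls a process makes, so you can see what is happening inside a running process and why a program fails.

This article walks through ten ways to use strace.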

2. Installing strace

Most Linux distributions now ship a mature package manager, so installing strace is simple:

  • Ubuntu/Debian

sudo apt install strace

  • RHEL/CentOS

sudo yum install strace

  • Fedora

sudo dnf install strace

3. Tracing Linux system calls

Put strace in front of a command to see what that command actually does. The following example traces df:

$ strace df -h
execve("/bin/df", ["df", "-h"], [/* 50 vars */]) = 0
brk(NULL) = 0x136e000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f82f78fd000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=147662, ...}) = 0
mmap(NULL, 147662, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f82f78d8000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1868984, ...}) = 0
mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f82f7310000
...

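The output is easy to read: each line is one system call, for example:

open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

Where:

  • open: the name of the system call
  • ("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC): the arguments of the system call
  • 3: the return value of the system call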
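The three parts described above can be pulled apart mechanically. Below is a minimal sketch that uses only shell parameter expansion on the sample line; the variable names are made up for illustration, and it assumes the simple "name(args) = value" shape (lines with an error description after the return value would need extra handling).

```shell
# Sample line copied from the trace above.
line='open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3'

# Everything before the first "(" is the system call name.
name=${line%%"("*}
# Everything after the last " = " is the return value.
ret=${line##*" = "}
# What remains between the two is the argument list.
args=${line#"$name"}
args=${args%" = $ret"}

echo "name: $name"
echo "args: $args"
echo "ret:  $ret"
```

Here the return value 3 is the file descriptor that open handed back, which is why the following fstat and close calls in the trace take 3 as their first argument.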

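4. Tracing only specific system calls

The full strace output of df -h is long, which makes it hard to find the calls we actually care about. The -e trace option filters the output according to the value passed to it.

4.1 Filtering by system call name

Pass the name of a system call to see only that call.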

$ sudo strace -e trace=write df -h
write(1, "Filesystem Size Used Avail"..., 49Filesystem Size Used Avail Use% Mounted on
) = 49
write(1, "udev 3.9G 0 3.9G"..., 43udev 3.9G 0 3.9G 0% /dev
) = 43
write(1, "tmpfs 788M 9.6M 779M"..., 43tmpfs 788M 9.6M 779M 2% /run
) = 43
write(1, "/dev/sda10 324G 252G 56G"..., 40/dev/sda10 324G 252G 56G 82% /
) = 40
write(1, "tmpfs 3.9G 104M 3.8G"..., 47tmpfs 3.9G 104M 3.8G 3% /dev/shm
) = 47
write(1, "tmpfs 5.0M 4.0K 5.0M"..., 48tmpfs 5.0M 4.0K 5.0M 1% /run/lock
) = 48
write(1, "tmpfs 3.9G 0 3.9G"..., 53tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
) = 53
write(1, "cgmfs 100K 0 100K"..., 56cgmfs 100K 0 100K 0% /run/cgmanager/fs
) = 56
write(1, "tmpfs 788M 28K 788M"..., 53tmpfs 788M 28K 788M 1% /run/user/1000
) = 53
+++ exited with 0 +++

You can also pass a comma-separated list of names, or all:

  • sudo strace -e trace=open,close df -h
  • sudo strace -e trace=open,close,read,write df -h
  • sudo strace -e trace=all df -h
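One caveat with filtering by name: on many current systems the C library implements open() through the openat system call, so trace=open alone may print nothing there. A small sketch of a wrapper that lists both names follows; trace_open_close is a hypothetical helper name, and the sketch assumes an x86_64 Linux system where both system calls exist.

```shell
# Hypothetical helper: trace the file open/close activity of any command.
# openat is listed next to open because newer C libraries typically
# route open() through the openat system call (assumption: x86_64 Linux).
trace_open_close() {
    strace -e trace=open,openat,close "$@"
}

# Usage (not run here): trace_open_close df -h
```

If the filtered trace comes back empty, running the command once without -e is a quick way to check which system call names the program really uses.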

4.2 Tracing process-management calls

$ sudo strace -q -e trace=process df -h
execve("/bin/df", ["df", "-h"], [/* 17 vars */]) = 0
arch_prctl(ARCH_SET_FS, 0x7fe2222ff700) = 0
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 788M 9.6M 779M 2% /run
/dev/sda10 324G 252G 56G 82% /
tmpfs 3.9G 104M 3.8G 3% /dev/shm
tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
cgmfs 100K 0 100K 0% /run/cgmanager/fs
tmpfs 788M 28K 788M 1% /run/user/1000
exit_group(0) = ?
+++ exited with 0 +++

4.3 Tracing file-related calls

$ sudo strace -q -e trace=file df -h
execve("/bin/df", ["df", "-h"], [/* 17 vars */]) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
...

4.4 Tracing memory-related calls

$ sudo strace -q -e trace=memory df -h
brk(NULL) = 0x77a000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe8f4658000
mmap(NULL, 147662, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe8f4633000
mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fe8f406b000
mprotect(0x7fe8f422b000, 2097152, PROT_NONE) = 0
mmap(0x7fe8f442b000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c0000) = 0x7fe8f442b000
mmap(0x7fe8f4431000, 14752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fe8f4431000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe8f4632000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe8f4631000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe8f4630000
mprotect(0x7fe8f442b000, 16384, PROT_READ) = 0
mprotect(0x616000, 4096, PROT_READ) = 0
mprotect(0x7fe8f465a000, 4096, PROT_READ) = 0
munmap(0x7fe8f4633000, 147662) = 0
mmap(NULL, 2981280, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe8f3d93000
brk(NULL) = 0x77a000
brk(0x79b000) = 0x79b000
mmap(NULL, 619, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe8f4657000
mmap(NULL, 26258, PROT_READ, MAP_SHARED, 3, 0) = 0x7fe8f4650000
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 788M 9.6M 779M 2% /run
/dev/sda10 324G 252G 56G 82% /
tmpfs 3.9G 104M 3.8G 3% /dev/shm
tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
cgmfs 100K 0 100K 0% /run/cgmanager/fs
tmpfs 788M 28K 788M 1% /run/user/1000
+++ exited with 0 +++

4.5 Tracing network-related calls

$ sudo strace -e trace=network df -h

4.6 Tracing signal-related calls

$ sudo strace -e trace=signal df -h

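5. Tracing by process ID

If a process is already running, you can attach to it by its PID with -p. strace then prints the system calls the process makes from that point on.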

$ sudo strace -p 3569
strace: Process 3569 attached
restart_syscall(<... resuming interrupted poll ...>) = 1
recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"U\2\24\300!\247\330\0\3\24\4\0\20\0\0\0\0\0\0\24\24\24\24\24\0\0\3\37%\2\0\0", 4096}], msg_controllen=0, msg_flags=0}, 0) = 32
recvmsg(4, 0x7ffee4dbf870, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(4, 0x7ffee4dbf850, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=10, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 6, -1) = 1 ([{fd=31, revents=POLLIN}])
read(31, "\372", 1) = 1
recvmsg(4, 0x7ffee4dbf850, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=10, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 6, 0) = 1 ([{fd=31, revents=POLLIN}])
read(31, "\372", 1) = 1
recvmsg(4, 0x7ffee4dbf850, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=10, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}], 6, 0) = 0 (Timeout)
mprotect(0x207faa20000, 8192, PROT_READ|PROT_WRITE) = 0
mprotect(0x207faa20000, 8192, PROT_READ|PROT_EXEC) = 0
mprotect(0x207faa21000, 4096, PROT_READ|PROT_WRITE) = 0
mprotect(0x207faa21000, 4096, PROT_READ|PROT_EXEC) = 0
...
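Looking up the PID by hand is a small extra step before attaching. The sketch below resolves a process name with pgrep and then attaches to it; trace_by_name is a hypothetical helper name, and the sketch assumes pgrep (from procps) is installed and that attaching to the oldest matching process is what you want.

```shell
# Hypothetical helper: attach strace to the oldest process with this exact name.
trace_by_name() {
    # pgrep -o picks the oldest matching process; -x matches the name exactly.
    pid=$(pgrep -o -x "$1") || {
        echo "no running process named '$1'" >&2
        return 1
    }
    sudo strace -p "$pid"
}

# Usage (not run here; nginx is only an example name): trace_by_name nginx
```

Press Ctrl-C to detach; the traced process keeps running afterwards.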

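6. Getting a summary for a process

The -c option prints a summary for each traced system call: time spent, number of calls, and number of errors.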

$ sudo strace -c -p 3569
strace: Process 3569 attached
^Cstrace: Process 3569 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.73    0.016000           8      1971           poll
  0.16    0.000025           0       509        75 futex
  0.06    0.000010           0      1985      1966 recvmsg
  0.06    0.000009           0      2336           mprotect
  0.00    0.000000           0       478           read
  0.00    0.000000           0        13           write
  0.00    0.000000           0        29           mmap
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        18           writev
  0.00    0.000000           0       351           madvise
  0.00    0.000000           0         1           restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.016044                  7700      2041 total
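To make the calls and errors columns concrete, here is a rough sketch that recomputes them from plain trace lines with awk. It only illustrates what the columns count and is not how strace itself works; count_calls is a hypothetical helper name, the sample lines are taken from the df trace earlier in the article, and a call is treated as failed when it returns -1.

```shell
# Hypothetical helper: read strace lines on stdin, print "name calls errors".
count_calls() {
    awk -F'(' '
        { calls[$1]++ }              # text before the first "(" is the call name
        / = -1 / { errors[$1]++ }    # a return value of -1 marks a failed call
        END { for (n in calls) printf "%s %d %d\n", n, calls[n], errors[n] }
    ' | sort
}

count_calls <<'EOF'
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=147662, ...}) = 0
close(3) = 0
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
EOF
# Prints:
# access 2 2
# close 1 0
# fstat 1 0
# open 1 0
```

Lines that are not system calls, such as the final "+++ exited with 0 +++", would be miscounted by this sketch and should be filtered out first.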

7. Printing the instruction pointer

The -i option prints the instruction pointer at the time of each system call.

$ sudo strace -i df -h
[00007f0d7534c777] execve("/bin/df", ["df", "-h"], [/* 17 vars */]) = 0
[00007faf9cafa4b9] brk(NULL) = 0x12f0000
[00007faf9cafb387] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[00007faf9cafb47a] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf9cd03000
[00007faf9cafb387] access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
[00007faf9cafb327] open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
[00007faf9cafb2b4] fstat(3, {st_mode=S_IFREG|0644, st_size=147662, ...}) = 0
[00007faf9cafb47a] mmap(NULL, 147662, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7faf9ccde000
[00007faf9cafb427] close(3) = 0
[00007faf9cafb387] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[00007faf9cafb327] open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
[00007faf9cafb347] read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
[00007faf9cafb2b4] fstat(3, {st_mode=S_IFREG|0755, st_size=1868984, ...}) = 0
[00007faf9cafb47a] mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7faf9c716000
[00007faf9cafb517] mprotect(0x7faf9c8d6000, 2097152, PROT_NONE) = 0
...

8. Showing the time of each call

The -t option prefixes each line with the wall-clock time at which the call was made.

$ sudo strace -t df -h
15:19:25 execve("/bin/df", ["df", "-h"], [/* 17 vars */]) = 0
15:19:25 brk(NULL) = 0x234c000
15:19:25 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
15:19:25 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8c7f1d9000
15:19:25 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
15:19:25 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
15:19:25 fstat(3, {st_mode=S_IFREG|0644, st_size=147662, ...}) = 0
15:19:25 mmap(NULL, 147662, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8c7f1b4000
15:19:25 close(3) = 0
15:19:25 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
15:19:25 open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
15:19:25 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
15:19:25 fstat(3, {st_mode=S_IFREG|0755, st_size=1868984, ...}) = 0
15:19:25 mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8c7ebec000
15:19:25 mprotect(0x7f8c7edac000, 2097152, PROT_NONE) = 0
...

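9. Showing the time spent in each system call

The -T option appends the time spent in each system call, in seconds, inside angle brackets at the end of the line.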

$ sudo strace -T df -h
execve("/bin/df", ["df", "-h"], [/* 17 vars */]) = 0 <0.000287>
brk(NULL) = 0xeca000 <0.000035>
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000028>
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9aff2b1000 <0.000020>
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.000019>
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000022>
fstat(3, {st_mode=S_IFREG|0644, st_size=147662, ...}) = 0 <0.000015>
mmap(NULL, 147662, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f9aff28c000 <0.000019>
close(3) = 0 <0.000014>
...
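With a long trace, the per-call durations are easier to use once they are sorted. The following is a minimal sketch, assuming the output format shown above with the duration in angle brackets at the end of each line; slowest_calls is a hypothetical helper name, and the sample lines are copied from the trace above.

```shell
# Hypothetical helper: read `strace -T` lines on stdin and print the N slowest
# calls as "seconds name" (N defaults to 3).
slowest_calls() {
    # Capture the call name before "(" and the duration inside the trailing <...>.
    sed -n 's/^\([a-z_0-9]*\)(.*<\([0-9.]*\)>$/\2 \1/p' |
        LC_ALL=C sort -rn |
        head -n "${1:-3}"
}

slowest_calls 3 <<'EOF'
execve("/bin/df", ["df", "-h"], [/* 17 vars */]) = 0 <0.000287>
brk(NULL) = 0xeca000 <0.000035>
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000028>
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000022>
close(3) = 0 <0.000014>
EOF
# Prints:
# 0.000287 execve
# 0.000035 brk
# 0.000028 access
```

Lines that do not end in a duration are skipped by the sed pattern, so the helper can be fed a whole trace without cleaning it up first.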

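10. Writing the trace to a file

The -o option writes the trace to a file instead of the terminal. The traced command's own output is still printed as usual, which is why only the df listing appears below.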

$ sudo strace -o df_debug.txt df -h
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 788M 9.6M 779M 2% /run
/dev/sda10 324G 252G 56G 82% /
tmpfs 3.9G 104M 3.8G 3% /dev/shm
tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
cgmfs 100K 0 100K 0% /run/cgmanager/fs
tmpfs 788M 28K 788M 1% /run/user/1000

11. Showing strace's debug information

The -d option prints strace's own debugging output.

$ strace -d df -h
ptrace_setoptions = 0x11
new tcb for pid 5637, active tcbs:1
[wait(0x80137f) = 5637] ?? (128),PTRACE_EVENT?? (128)
pid 5637 has TCB_STARTUP, initializing it
setting opts 11 on pid 5637
[wait(0x80057f) = 5637] ?? (128),PTRACE_EVENT?? (128)
[wait(0x127f) = 5637] WIFSTOPPED,sig=SIGCONT
[wait(0x857f) = 5637] WIFSTOPPED,sig=133
execve("/usr/bin/df", ["df", "-h"], [/* 22 vars */] [wait(0x4057f) = 5637] WIFSTOPPED,sig=SIGTRAP,PTRACE_EVENT_EXEC
[wait(0x857f) = 5637] WIFSTOPPED,sig=133
) = 0
[wait(0x857f) = 5637] WIFSTOPPED,sig=133
...

Appendix -- References

https://www.tecmint.com/strace-commands-for-troubleshooting-and-debugging-linux/

Originally published on 2022-03-05 via the 小脑斧科技博客 WeChat official account.
