
[Linux] How Linux Works, Chapter 8: A Closer Look at Processes and Resource Utilization (Part 1)

Original article by 阿东, published 2024-04-21 08:01:03

Chapter 8. A Closer Look at Processes and Resource Utilization

This chapter takes you deeper into the relationships between processes, the kernel, and system resources. There are three basic kinds of hardware resources: CPU, memory, and I/O. Processes vie for these resources, and the kernel’s job is to allocate resources fairly. The kernel itself is also a resource—a software resource that processes use to perform tasks such as creating new processes and communicating with other processes. Many of the tools that you see in this chapter are often thought of as performance-monitoring tools. They’re particularly helpful if your system is slowing to a crawl and you’re trying to figure out why. However, you shouldn’t get too distracted by performance; trying to optimize a system that’s already working correctly is often a waste of time. Instead, concentrate on understanding what the tools actually measure, and you’ll gain great insight into how the kernel works.

8.1 Tracking Processes

You learned how to use ps in 2.16 Listing and Manipulating Processes to list processes running on your system at a particular time. The ps command lists current processes, but it does little to tell you how processes change over time. Therefore, it won’t really help you to determine which process is using too much CPU time or memory.

The top program is often more useful than ps because it displays the current system status as well as many of the fields in a ps listing, and it updates the display every second. Perhaps most important is that top shows the most active processes (that is, those currently taking up the most CPU time) at the top of its display.
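The numbers behind such a display come from the kernel's per-process accounting. The following is a minimal sketch, not how top itself is implemented: it assumes a Linux /proc filesystem and an awk on the PATH, and cpu_ticks is a hypothetical helper name. It prints the cumulative CPU time (user plus system, in clock ticks) of every process, largest first. Unlike top's default view, this is total time since each process started, not current activity.

```sh
#!/bin/sh
# Print "ticks pid" for every process, where ticks = utime + stime
# (fields 14 and 15 of /proc/<pid>/stat, in clock ticks).
cpu_ticks() {
    for f in /proc/[0-9]*/stat; do
        # Strip "pid (comm) " first so a command name containing spaces
        # cannot shift the fields; utime and stime then become $12 and $13.
        awk '{ pid = $1; sub(/^.*\) /, ""); print $12 + $13, pid }' "$f" 2>/dev/null || continue
    done
}

# Show the five processes that have used the most CPU time so far.
cpu_ticks | sort -rn | head -n 5
```

Running the script twice a few seconds apart and comparing the tick counts gives a rough idea of which processes are active right now, which is the comparison top performs continuously.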

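You can send commands to top with keystrokes. These are some of the most important commands:

- Spacebar: Updates the display immediately.
- M: Sorts by current resident memory usage.
- T: Sorts by total (cumulative) CPU usage.
- P: Sorts by current CPU usage (the default).
- u: Displays only one user’s processes.
- f: Selects different statistics to display.
- ?: Displays a usage summary for all top commands.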

Two other utilities for Linux, similar to top, offer an enhanced set of views and features: atop and htop. Most of the extra features are available from other utilities. For example, htop has many of the abilities of the lsof command described in the next section.

8.2 Finding Open Files with lsof

The lsof command lists open files and the processes using them. Because Unix places a lot of emphasis on files, lsof is among the most useful tools for finding trouble spots. But lsof doesn’t stop at regular files; it can also list network resources, dynamic libraries, pipes, and more.

8.2.1 Reading the lsof Output

Running lsof on the command line usually produces a tremendous amount of output. Below is a fragment of what you might see. This output includes open files from the init process as well as a running vi process:

```sh
$ lsof
COMMAND   PID  USER   FD  TYPE DEVICE  SIZE     NODE NAME
init        1  root  cwd   DIR    8,1  4096        2 /
init        1  root  rtd   DIR    8,1  4096        2 /
init        1  root  mem   REG    8,1 47040  9705817 /lib/i386-linux-gnu/libnss_files-2.15.so
init        1  root  mem   REG    8,1 42652  9705821 /lib/i386-linux-gnu/libnss_nis-2.15.so
init        1  root  mem   REG    8,1 92016  9705833 /lib/i386-linux-gnu/libnsl-2.15.so
--snip--
vi      22728 juser  cwd   DIR    8,1  4096 14945078 /home/juser/w/c
vi      22728 juser   4u   REG    8,1  1288  1056519 /home/juser/w/c/f
--snip--
```

The output shows the following fields (listed in the top row):

- COMMAND: The command name for the process that holds the file descriptor.
- PID: The process ID.
- USER: The user running the process.
- FD: This field can contain two kinds of elements. In the output above, the FD column shows the purpose of the file. The FD field can also list the file descriptor of the open file, a number that a process uses together with the system libraries and kernel to identify and manipulate a file.
- TYPE: The file type (regular file, directory, socket, and so on).
- DEVICE: The major and minor number of the device that holds the file.
- SIZE: The file’s size.
- NODE: The file’s inode number.
- NAME: The filename.

The lsof(1) manual page contains a full list of what you might see for each field, but you should be able to figure out what you’re looking at just by looking at the output. For example, look at the entries with cwd in the FD field. These lines indicate the current working directories of the processes. Another example is the very last line, which shows a file that the user is currently editing with vi.

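Two of the FD column's values can be cross-checked by hand. As a rough sketch, assuming a Linux /proc filesystem (proc_cwd and proc_fds are hypothetical helper names, and this is not a description of how lsof is implemented), the cwd entry corresponds to the /proc/PID/cwd symbolic link, and the numeric descriptors correspond to the links under /proc/PID/fd:

```sh
#!/bin/sh
# Print the current working directory of a process
# (the entry lsof shows with "cwd" in the FD column).
proc_cwd() {
    readlink "/proc/$1/cwd"
}

# Print each numeric file descriptor of a process as "fd -> target".
proc_fds() {
    for fd in "/proc/$1/fd"/*; do
        # Sockets and pipes are links to names that are not real paths,
        # so accept dangling links as well as existing files.
        [ -e "$fd" ] || [ -L "$fd" ] || continue
        printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}

# Inspect the shell that is running this script.
proc_cwd "$$"
proc_fds "$$"
```

Reading another user's process this way normally requires the same privileges that lsof itself would need.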

8.2.2 Using lsof

There are two basic approaches to running lsof:

- List everything and pipe the output to a command like less, and then search for what you’re looking for. This can take a while due to the amount of output generated.
- Narrow down the list that lsof provides with command-line options.

You can use command-line options to provide a filename as an argument and have lsof list only the entries that match the argument. For example, the following command displays entries for open files in /usr:
```sh
$ lsof /usr
```

To list the open files for a particular process ID, run:

```sh
$ lsof -p pid
```

For a brief summary of lsof’s many options, run lsof -h. Most options pertain to the output format. (See Chapter 10 for a discussion of the lsof network features.)

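The second approach, narrowing the list with options, can be wrapped in a small script. This is only a sketch: open_files_of is a hypothetical helper, and it assumes lsof may or may not be installed on the machine, so it checks first.

```sh
#!/bin/sh
# Show the first few lsof entries for one process.
# Usage: open_files_of PID [MAX_LINES]
open_files_of() {
    if ! command -v lsof >/dev/null 2>&1; then
        echo "lsof is not installed" >&2
        return 127
    fi
    # -p selects entries by process ID; head keeps the output manageable.
    lsof -p "$1" 2>/dev/null | head -n "${2:-10}"
}

# Inspect the shell that is running this script.
open_files_of "$$" 5 || true
```

Other selectors can be used in the same way, for example -u for a user name or -c for the beginning of a command name.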

NOTE lsof is highly dependent on kernel information. If you upgrade your kernel and you’re not routinely updating everything, you might need to upgrade lsof. In addition, if you perform a distribution update to both the kernel and lsof, the updated lsof might not work until you reboot with the new kernel.

8.3 Tracing Program Execution and System Calls

The tools we’ve seen so far examine active processes. However, if you have no idea why a program dies almost immediately after starting up, even lsof won’t help you. In fact, you’d have a difficult time even running lsof concurrently with a failed command.

The strace (system call trace) and ltrace (library trace) commands can help you discover what a program attempts to do. These tools produce extraordinarily large amounts of output, but once you know what to look for, you’ll have more tools at your disposal for tracking down problems.

8.3.1 strace

Recall that a system call is a privileged operation that a user-space process asks the kernel to perform, such as opening and reading data from a file. The strace utility prints all the system calls that a process makes. To see it in action, run this command:

```sh
$ strace cat /dev/null
```

In Chapter 1, you learned that when one process wants to start another process, it invokes the fork() system call to spawn a copy of itself, and then the copy uses a member of the exec() family of system calls to start running a new program. The strace command begins working on the new process (the copy of the original process) just after the fork() call. Therefore, the first lines of the output from this command should show execve() in action, followed by a memory initialization call, brk(), as follows:

```sh
execve("/bin/cat", ["cat", "/dev/null"], [/* 58 vars */]) = 0
brk(0) = 0x9b65000
```
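The fork-then-exec sequence can be observed from the shell itself. In this small sketch (the variable name child_ppid is only illustrative), the command substitution makes the shell fork a copy of itself, and the exec builtin replaces that copy with a new program, here another sh that reports the PID of its parent:

```sh
#!/bin/sh
# $(...) forks a copy of this shell; exec then replaces that copy with a
# new program (another sh) that prints its parent's process ID.
child_ppid=$(exec sh -c 'echo "$PPID"')

echo "this shell: $$"
echo "parent of the exec'd program: $child_ppid"
```

Both lines should show the same number, because the process that ran exec was a direct child of the original shell.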

The next part of the output deals primarily with loading shared libraries. You can ignore this unless you really want to know what the shared library system does.

```sh
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77b5000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
--snip--
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200^\1"..., 1024) = 1024
```

In addition, skip past the mmap output until you get to the lines that look like this:

```sh
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 6), ...}) = 0
open("/dev/null", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
fadvise64_64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(3, "", 32768) = 0
close(3) = 0
close(1) = 0
close(2) = 0
exit_group(0) = ?
```

This part of the output shows the command at work. First, look at the open() call, which opens a file. The 3 is a result that means success (3 is the file descriptor that the kernel returns after opening the file). Below that, you see where cat reads from /dev/null (the read() call, which also has 3 as the file descriptor). Then there’s nothing more to read, so the program closes the file descriptor and exits with exit_group().

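The same open, read, and close sequence can be reproduced with shell redirections. This is an illustrative sketch, not what cat does internally: here the script asks for descriptor 3 explicitly, whereas in the trace the kernel chose 3 because it was the lowest free descriptor.

```sh
#!/bin/sh
exec 3</dev/null            # open /dev/null read-only on descriptor 3

if read -r line <&3; then   # try to read one line from descriptor 3
    eof=no
else
    eof=yes                 # nothing to read: immediate end of file
fi

exec 3<&-                   # close descriptor 3

echo "end of file reached immediately: $eof"
```

Because /dev/null never supplies data, the read fails at once, which matches the read() call returning 0 in the trace.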

What happens when there’s a problem? Try strace cat not_a_file instead and examine the open() call in the resulting output:

```sh
open("not_a_file", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
```

Because open() couldn’t open the file, it returned -1 to signal an error. You can see that strace reports the exact error and gives you a small description of the error.

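Without strace, the failed open is visible only indirectly, through the command's exit status and error message. A minimal sketch follows, assuming that the file name built in the missing variable does not exist in the current directory:

```sh
#!/bin/sh
missing="not_a_file.$$"   # hypothetical name, assumed not to exist

# Capture cat's error message and exit status instead of letting them scroll by.
if err=$(LC_ALL=C cat "$missing" 2>&1); then
    status=0
else
    status=$?
fi

echo "exit status: $status"
echo "message: $err"
```

The message text is the description that belongs to ENOENT, the same error that strace prints next to the -1 return value.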

Missing files are the most common problems with Unix programs, so if the system log and other log information aren’t very helpful and you have nowhere else to turn, strace can be of great use. You can even use it on daemons that detach themselves. For example:

```sh
$ strace -o crummyd_strace -ff crummyd
```

In this example, the -o option tells strace to write its output to a file, and -ff makes it follow any child processes that crummyd spawns, logging each one to crummyd_strace.pid, where pid is the process ID of that process.

8.3.2 ltrace

The ltrace command tracks shared library calls. The output is similar to that of strace, which is why we’re mentioning it here, but it doesn’t track anything at the kernel level. Be warned that there are many more shared library calls than system calls. You’ll definitely need to filter the output, and ltrace itself has many built-in options to assist you.

NOTE See 15.1.4 Shared Libraries for more on shared libraries. The ltrace command doesn’t work on statically linked binaries.

8.4 Threads

In Linux, some processes are divided into pieces called threads. A thread is very similar to a process—it has an identifier (TID, or thread ID), and the kernel schedules and runs threads just like processes. However, unlike separate processes, which usually do not share system resources such as memory and I/O connections with other processes, all threads inside a single process share their system resources and some memory.

8.4.1 Single-Threaded and Multithreaded Processes

Many processes have only one thread. A process with one thread is single-threaded, and a process with more than one thread is multithreaded. All processes start out single-threaded. This starting thread is usually called the main thread. The main thread may then start new threads in order for the process to become multithreaded, similar to the way a process can call fork() to start a new process.

NOTE It’s rare to refer to threads at all when a process is single-threaded. This book will not mention threads unless multithreaded processes make a difference in what you see or experience.

The primary advantage of a multithreaded process is that when the process has a lot to do, threads can run simultaneously on multiple processors, potentially speeding up computation. Although you can also achieve simultaneous computation with multiple processes, threads start faster than processes, and it is often easier and/or more efficient for threads to intercommunicate using their shared memory than it is for processes to communicate over a channel such as a network connection or a pipe.

Some programs use threads to overcome problems managing multiple I/O resources. Traditionally, a process would sometimes use fork() to start a new subprocess in order to deal with a new input or output stream. Threads offer a similar mechanism without the overhead of starting a new process.

8.4.2 Viewing Threads

By default, the output from the ps and top commands shows only processes. To display the thread information in ps, add the m option. Here is some sample output:

Example 8-1. Viewing threads with ps m

```sh
$ ps m
  PID TTY      STAT   TIME COMMAND
 3587 pts/3    -      0:00 bash ➊
    - -        Ss     0:00 -
 3592 pts/4    -      0:00 bash ➋
    - -        Ss     0:00 -
12287 pts/8    -      0:54 /usr/bin/python /usr/bin/gm-notify ➌
    - -        SLl    0:48 -
    - -        SLl    0:00 -
    - -        SLl    0:06 -
    - -        SLl    0:00 -
```

Example 8-1 shows processes along with threads. Each line with a number in the PID column (at ➊, ➋, and ➌) represents a process, as in the normal ps output. The lines with the dashes in the PID column represent the threads associated with the process. In this output, the processes at ➊ and ➋ have only one thread each, but process 12287 at ➌ is multithreaded with four threads.

If you would like to view the thread IDs with ps, you can use a custom output format. This example shows only the process IDs, thread IDs, and command:

Example 8-2. Showing process IDs and thread IDs with ps m

```sh
$ ps m -o pid,tid,command
  PID   TID COMMAND
 3587     - bash
    -  3587 -
 3592     - bash
    -  3592 -
12287     - /usr/bin/python /usr/bin/gm-notify
    - 12287 -
    - 12288 -
    - 12289 -
    - 12295 -
```

The sample output in Example 8-2 corresponds to the threads shown in Example 8-1. Notice that the thread IDs of the single-threaded processes are identical to the process IDs; this is the main thread. For the multithreaded process 12287, thread 12287 is also the main thread.

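The relationship between process IDs and thread IDs can also be checked without ps. The sketch below assumes a Linux /proc filesystem, where each thread of a process appears as a directory under /proc/PID/task; thread_ids and thread_count are hypothetical helper names.

```sh
#!/bin/sh
# Print the thread IDs of a process, one per line.
thread_ids() {
    for t in "/proc/$1/task"/*; do
        if [ -d "$t" ]; then
            echo "${t##*/}"
        fi
    done
}

# Print the number of threads, as reported in /proc/<pid>/status.
thread_count() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# The shell is single-threaded, so its only thread ID equals its process ID.
echo "PID: $$"
echo "TIDs: $(thread_ids "$$")"
echo "thread count: $(thread_count "$$")"
```

Pointing the same helpers at a multithreaded process should list several thread IDs, the first of which matches the process ID.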

NOTE Normally, you won’t interact with individual threads as you would processes. You need to know a lot about how a multithreaded program was written in order to act on one thread at a time, and even then, doing so might not be a good idea.

Threads can confuse things when it comes to resource monitoring because individual threads in a multithreaded process can consume resources simultaneously. For example, top doesn’t show threads by default; you’ll need to press H to turn it on. For most of the resource monitoring tools that you’re about to see, you’ll have to do a little extra work to turn on the thread display.

8.5 Introduction to Resource Monitoring

Now we’ll discuss some topics in resource monitoring, including processor (CPU) time, memory, and disk I/O. We’ll examine utilization on a systemwide scale, as well as on a per-process basis.

Many people touch the inner workings of the Linux kernel in the interest of improving performance. However, most Linux systems perform well under a distribution’s default settings, and you can spend days trying to tune your machine’s performance without meaningful results, especially if you don’t know what to look for. So rather than think about performance as you experiment with the tools in this chapter, think about seeing the kernel in action as it divides resources among processes.

8.6 Measuring CPU Time

To monitor one or more specific processes over time, use the -p option to top, with this syntax:

```sh
$ top -p pid1 [-p pid2 ...]
```

To find out how much CPU time a command uses during its lifetime, use time. Most shells have a built-in time command that doesn’t provide extensive statistics, so you’ll probably need to run /usr/bin/time. For example, to measure the CPU time used by ls, run:

```sh
$ /usr/bin/time ls
```

After ls terminates, time should print output like that below. The key fields are the user, system, and elapsed times:

```sh
0.05user 0.09system 0:00.44elapsed 31%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (125major+51minor)pagefaults 0swaps
```

- User time: The number of seconds that the CPU has spent running the program’s own code. On modern processors, some commands run so quickly, and the CPU time is therefore so low, that time rounds it down to zero.
- System time: How much time the kernel spends doing the process’s work (for example, reading files and directories).
- Elapsed time: The total time it took to run the process from start to finish, including the time that the CPU spent doing other tasks. This number is normally not very useful for performance measurement, but subtracting the user and system time from elapsed time can give you a general idea of how long a process spends waiting for system resources.

The remainder of the output primarily details memory and I/O usage. You’ll learn more about the page fault output in 8.9 Memory.
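The subtraction described for elapsed time can be worked through with the sample figures above (0.05 s user, 0.09 s system, 0.44 s elapsed). This is a small arithmetic sketch; wait_seconds and cpu_percent are hypothetical helper names, and the percentage is truncated to a whole number on the assumption that this is how the 31%CPU figure was derived.

```sh
#!/bin/sh
# Time not accounted for by CPU work: elapsed - (user + system), in seconds.
wait_seconds() {
    awk -v u="$1" -v s="$2" -v e="$3" 'BEGIN { printf "%.2f\n", e - (u + s) }'
}

# Share of the elapsed time spent on the CPU, truncated to a whole percent.
cpu_percent() {
    awk -v u="$1" -v s="$2" -v e="$3" 'BEGIN { printf "%d\n", 100 * (u + s) / e }'
}

wait_seconds 0.05 0.09 0.44   # prints 0.30
cpu_percent 0.05 0.09 0.44    # prints 31
```

So in the sample run, roughly 0.30 of the 0.44 seconds went to waiting rather than computing.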

8.7 Adjusting Process Priorities

You can change the way the kernel schedules a process in order to give the process more or less CPU time than other processes. The kernel runs each process according to its scheduling priority, which is a number between -20 and 19, with -20 being the foremost priority. (Yes, this can be confusing.)

The ps -l command lists the current priority of a process, but it’s a little easier to see the priorities in action with the top command, as shown here:

```sh
$ top
Tasks: 244 total, 2 running, 242 sleeping, 0 stopped, 0 zombie
Cpu(s): 31.7%us, 2.8%sy, 0.0%ni, 65.4%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 6137216k total, 5583560k used, 553656k free, 72008k buffers
Swap: 4135932k total, 694192k used, 3441740k free, 767640k cached
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
28883 bri   20   0 1280m 763m  32m S   58 12.7 213:00.65 chromium-browse
 1175 root  20   0  210m  43m  28m R   44  0.7  14292:35 Xorg
 4022 bri   20   0  413m 201m  28m S   29  3.4   3640:13 chromium-browse
 4029 bri   20   0  378m 206m  19m S    2  3.5  32:50.86 chromium-browse
 3971 bri   20   0  881m 359m  32m S    2  6.0 563:06.88 chromium-browse
 5378 bri   20   0  152m  10m 7064 S    1  0.2  24:30.21 compiz
 3821 bri   20   0  312m  37m  14m S    0  0.6  29:25.57 soffice.bin
 4117 bri   20   0  321m 105m  18m S    0  1.8  34:55.01 chromium-browse
 4138 bri   20   0  331m  99m  21m S    0  1.7 121:44.19 chromium-browse
 4274 bri   20   0  232m  60m  13m S    0  1.0  37:33.78 chromium-browse
 4267 bri   20   0 1102m 844m  11m S    0 14.1  29:59.27 chromium-browse
 2327 bri   20   0  301m  43m  16m S    0  0.7 109:55.65 unity-2d-shell
```

In the top output above, the PR (priority) column lists the kernel’s current schedule priority for the process. The higher the number, the less likely the kernel is to schedule the process if others need CPU time. The schedule priority alone does not determine the kernel’s decision to give CPU time to a process, and it changes frequently during program execution according to the amount of CPU time that the process consumes.

Next to the priority column is the nice value (NI) column, which gives a hint to the kernel’s scheduler. This is what you care about when trying to influence the kernel’s decision. The kernel adds the nice value to the current priority to determine the next time slot for the process.

By default, the nice value is 0. Now, say you’re running a big computation in the background that you don’t want to bog down your interactive session. To have that process take a backseat to other processes and run only when the other tasks have nothing to do, you could change the nice value to 19, the largest value Linux accepts (a larger request such as 20 is clamped to 19), with the renice command (where pid is the process ID of the process that you want to change):

```sh
$ renice 19 pid
```

If you’re the superuser, you can set the nice value to a negative number, but doing so is almost always a bad idea because system processes may not get enough CPU time. In fact, you probably won’t need to alter nice values much because many Linux systems have only a single user, and that user does not perform much real computation. (The nice value was much more important back when there were many users on a single machine.)

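To see renice take effect, the following sketch starts a throwaway background job, lowers its priority, and reads the nice value back. It assumes a Linux /proc filesystem and uses sleep as a stand-in for a long computation. nice_of is a hypothetical helper, and the value 19 is used because Linux caps the nice value there.

```sh
#!/bin/sh
# Read a process's nice value: field 19 of /proc/<pid>/stat, which becomes
# field 17 once the leading "pid (comm) " is stripped.
nice_of() {
    awk '{ sub(/^.*\) /, ""); print $17 }' "/proc/$1/stat"
}

sleep 30 &                     # stand-in for a long background computation
pid=$!

renice 19 "$pid" >/dev/null    # raising the nice value needs no privileges
after=$(nice_of "$pid")

kill "$pid"                    # clean up the stand-in job
wait "$pid" 2>/dev/null || true

echo "nice value of $pid was set to $after"
```

Lowering the value again afterwards (for example, back to 0) would normally require superuser privileges.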

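Original-content statement: this article is published on the Tencent Cloud Developer Community with the author’s authorization and may not be reproduced without permission.

In case of infringement, contact cloudcommunity@tencent.com for removal.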
