I have added the benchmark directive to some of the rules in my snakemake workflow, and the resulting files have the following header:
s h:m:s max_rss max_vms max_uss max_pss io_in io_out mean_load
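The only documentation I could find mentions a "benchmark txt file (which will contain a tab-separated table of run times and memory usage in MiB)".
I can guess that columns 1 and 2 are two different ways of displaying the time it took to execute the rule (in seconds, and converted to hours, minutes and seconds).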
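io_in and io_out are most likely related to disk read and write activity, but in what units are they measured?
What are the other columns? Is this documented anywhere?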
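To make the guess about the first two columns concrete, here is a minimal sketch that loads such a file with the standard library only. The header is the real one from above, but the data row is invented, and the helper names `read_benchmark` and `seconds_to_hms` are hypothetical. It also assumes that the h:m:s column is simply the s column truncated to whole seconds, which is a guess rather than something the documentation states.

```python
import csv
import io

# The header is the real one; the data row is invented for illustration.
SAMPLE = (
    "s\th:m:s\tmax_rss\tmax_vms\tmax_uss\tmax_pss\tio_in\tio_out\tmean_load\n"
    "75.3041\t0:01:15\t120.50\t890.25\t110.00\t112.75\t3.20\t0.85\t98.40\n"
)


def read_benchmark(text):
    """Parse tab-separated benchmark text into a list of dicts (one per row)."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def seconds_to_hms(seconds):
    """Format a duration in seconds as h:mm:ss, dropping fractions of a second."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return "%d:%02d:%02d" % (hours, minutes, secs)


rows = read_benchmark(SAMPLE)
for row in rows:
    # Compare the stored h:m:s value with one recomputed from the s column.
    print(row["s"], row["h:m:s"], seconds_to_hms(float(row["s"])))
```

If the assumption holds, the last two values printed for each row should agree.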
Edit: Looking at the source code
I found the following code in /snakemake/benchmark.py, which is most likely where the benchmark data comes from:
def _update_record(self):
    """Perform the actual measurement"""
    # Memory measurements
    rss, vms, uss, pss = 0, 0, 0, 0
    # I/O measurements
    io_in, io_out = 0, 0
    # CPU seconds
    cpu_seconds = 0
    # Iterate over process and all children
    try:
        main = psutil.Process(self.pid)
        this_time = time.time()
        for proc in chain((main,), main.children(recursive=True)):
            meminfo = proc.memory_full_info()
            rss += meminfo.rss
            vms += meminfo.vms
            uss += meminfo.uss
            pss += meminfo.pss
            ioinfo = proc.io_counters()
            io_in += ioinfo.read_bytes
            io_out += ioinfo.write_bytes
            if self.bench_record.prev_time:
                cpu_seconds += proc.cpu_percent() / 100 * (
                    this_time - self.bench_record.prev_time)
        self.bench_record.prev_time = this_time
        if not self.bench_record.first_time:
            self.bench_record.prev_time = this_time
        rss /= 1024 * 1024
        vms /= 1024 * 1024
        uss /= 1024 * 1024
        pss /= 1024 * 1024
        io_in /= 1024 * 1024
        io_out /= 1024 * 1024
    except psutil.Error as e:
        return
    # Update benchmark record's RSS and VMS
    self.bench_record.max_rss = max(self.bench_record.max_rss or 0, rss)
    self.bench_record.max_vms = max(self.bench_record.max_vms or 0, vms)
    self.bench_record.max_uss = max(self.bench_record.max_uss or 0, uss)
    self.bench_record.max_pss = max(self.bench_record.max_pss or 0, pss)
    self.bench_record.io_in = io_in
    self.bench_record.io_out = io_out
    self.bench_record.cpu_seconds += cpu_seconds
So this appears to come from functionality provided by psutil.
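As far as I can tell from the last lines of that method, the three kinds of values are combined differently across repeated measurements: memory keeps the maximum seen so far, I/O is overwritten by the latest sample, and CPU seconds are summed. The sketch below only imitates that bookkeeping. The `Record` class, the `update` function and the sample numbers are all made up for illustration and are not part of snakemake.

```python
class Record:
    """Hypothetical stand-in for the benchmark record used in the excerpt."""

    def __init__(self):
        self.max_rss = None
        self.io_in = None
        self.cpu_seconds = 0.0


def update(record, rss_mib, io_in_mib, cpu_seconds):
    """Fold one measurement into the record, mirroring the excerpt's last lines."""
    # Memory: keep the largest value seen in any sample.
    record.max_rss = max(record.max_rss or 0, rss_mib)
    # I/O: the latest sample simply replaces the previous one.
    record.io_in = io_in_mib
    # CPU time: accumulate across samples.
    record.cpu_seconds += cpu_seconds


record = Record()
# Invented samples: (RSS in MiB, bytes read so far in MiB, CPU seconds)
for sample in [(50.0, 1.0, 0.5), (120.0, 2.5, 1.0), (80.0, 3.0, 0.25)]:
    update(record, *sample)

print(record.max_rss, record.io_in, record.cpu_seconds)
```

With these samples the memory peak comes from the second measurement, while the I/O value comes from the third, which would explain why only the memory columns carry a max_ prefix.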
Answered on 2017-11-09 11:56:55
Benchmarking in snakemake could certainly be better documented, but psutil does have documentation:
get_memory_info()
Return a tuple representing RSS (Resident Set Size) and VMS (Virtual Memory Size) in bytes.
On UNIX RSS and VMS are the same values shown by ps.
On Windows RSS and VMS refer to "Mem Usage" and "VM Size" columns of taskmgr.exe.
psutil.disk_io_counters(perdisk=False)
Return system disk I/O statistics as a namedtuple including the following attributes:
read_count: number of reads
write_count: number of writes
read_bytes: number of bytes read
write_bytes: number of bytes written
read_time: time spent reading from disk (in milliseconds)
write_time: time spent writing to disk (in milliseconds)
The code you found confirms that all memory usage figures and I/O counts are reported in MB (= bytes / (1024 * 1024), i.e. MiB).
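As a quick check of the unit, here is a small sketch of the same conversion. The helper names `bytes_to_mib` and `mib_to_bytes` are my own and do not come from snakemake or psutil.

```python
BYTES_PER_MIB = 1024 * 1024


def bytes_to_mib(n_bytes):
    """Convert a byte count to MiB, like the `/= 1024 * 1024` lines do."""
    return n_bytes / BYTES_PER_MIB


def mib_to_bytes(mib):
    """Inverse conversion, e.g. to turn an io_in value back into bytes."""
    return mib * BYTES_PER_MIB


# 5 * 1024 * 1024 bytes read from disk would show up as io_in = 5.0
print(bytes_to_mib(5 * 1024 * 1024))
```

So a value of 5.0 in the io_in column would correspond to 5242880 bytes read.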
https://stackoverflow.com/questions/46813371