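.NET 9 introduces RuntimeMetrics. It is built on the metrics API in .NET and uses System.Diagnostics.Metrics.Meter to produce metrics covering CPU, memory, GC, JIT, threads, and more.
Let's look at a simple example that combines it with OpenTelemetry. The sample references OpenTelemetry.Exporter.Console and exports the metrics data straight to the console: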
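using OpenTelemetry;
using OpenTelemetry.Metrics;

// Subscribe to the built-in System.Runtime meter and print its metrics to the console.
using var _ = Sdk.CreateMeterProviderBuilder()
    .AddMeter("System.Runtime")
    .AddConsoleExporter()
    .Build();

// Force a garbage collection every 10 seconds so the GC metrics change between exports.
while (true)
{
    await Task.Delay(TimeSpan.FromSeconds(10));
    GC.Collect();
}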
The output looks like this:
Metric Name: dotnet.gc.collections, The number of garbage collections that have occurred since the process has started., Unit: {collection}, Meter: System.Runtime
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:24:36.3639680Z] gc.heap.generation: gen2 LongSum
Value: 0
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:24:36.3639680Z] gc.heap.generation: gen1 LongSum
Value: 0
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:24:36.3639680Z] gc.heap.generation: gen0 LongSum
Value: 0
Metric Name: dotnet.process.memory.working_set, The number of bytes of physical memory mapped to the process context., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3779803Z, 2024-09-15T11:24:36.3640471Z] LongSumNonMonotonic
Value: 29241344
Metric Name: dotnet.gc.heap.total_allocated, The approximate number of bytes allocated on the managed GC heap since the process has started. The returned value does not include any native allocations., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3779975Z, 2024-09-15T11:24:36.3640481Z] LongSum
Value: 3427736
Metric Name: dotnet.gc.pause.time, The total amount of time paused in GC since the process has started., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3793018Z, 2024-09-15T11:24:36.3640951Z] DoubleSum
Value: 0
Metric Name: dotnet.jit.compiled_il.size, Count of bytes of intermediate language that have been compiled since the process has started., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3793233Z, 2024-09-15T11:24:36.3640965Z] LongSum
Value: 67761
Metric Name: dotnet.jit.compiled_methods, The number of times the JIT compiler (re)compiled methods since the process has started., Unit: {method}, Meter: System.Runtime
(2024-09-15T11:24:26.3793381Z, 2024-09-15T11:24:36.3640975Z] LongSum
Value: 829
Metric Name: dotnet.jit.compilation.time, The number of times the JIT compiler (re)compiled methods since the process has started., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3793520Z, 2024-09-15T11:24:36.3640985Z] DoubleSum
Value: 1.3528453
Metric Name: dotnet.monitor.lock_contentions, The number of times there was contention when trying to acquire a monitor lock since the process has started., Unit: {contention}, Meter: System.Runtime
(2024-09-15T11:24:26.3793612Z, 2024-09-15T11:24:36.3641010Z] LongSum
Value: 0
Metric Name: dotnet.thread_pool.thread.count, The number of thread pool threads that currently exist., Unit: {thread}, Meter: System.Runtime
(2024-09-15T11:24:26.3793740Z, 2024-09-15T11:24:36.3641018Z] LongSum
Value: 1
Metric Name: dotnet.thread_pool.work_item.count, The number of work items that the thread pool has completed since the process has started., Unit: {work_item}, Meter: System.Runtime
(2024-09-15T11:24:26.3793850Z, 2024-09-15T11:24:36.3641026Z] LongSum
Value: 2
Metric Name: dotnet.thread_pool.queue.length, The number of work items that are currently queued to be processed by the thread pool., Unit: {work_item}, Meter: System.Runtime
(2024-09-15T11:24:26.3793936Z, 2024-09-15T11:24:36.3641034Z] LongSum
Value: 0
Metric Name: dotnet.timer.count, The number of timer instances that are currently active. An active timer is registered to tick at some point in the future and has not yet been canceled., Unit: {timer}, Meter: System.Runtime
(2024-09-15T11:24:26.3794100Z, 2024-09-15T11:24:36.3641060Z] LongSumNonMonotonic
Value: 2
Metric Name: dotnet.assembly.count, The number of .NET assemblies that are currently loaded., Unit: {assembly}, Meter: System.Runtime
(2024-09-15T11:24:26.3794199Z, 2024-09-15T11:24:36.3641072Z] LongSumNonMonotonic
Value: 33
Metric Name: dotnet.process.cpu.count, The number of processors available to the process., Unit: {cpu}, Meter: System.Runtime
(2024-09-15T11:24:26.3794408Z, 2024-09-15T11:24:36.3641099Z] LongSumNonMonotonic
Value: 22
Metric Name: dotnet.process.cpu.time, CPU time used by the process., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3794527Z, 2024-09-15T11:24:36.3641105Z] cpu.mode: user DoubleSum
Value: 0.15625
(2024-09-15T11:24:26.3794527Z, 2024-09-15T11:24:36.3641105Z] cpu.mode: system DoubleSum
Value: 0
As the loop keeps calling GC.Collect, the GC metrics change as well:
Metric Name: dotnet.gc.collections, The number of garbage collections that have occurred since the process has started., Unit: {collection}, Meter: System.Runtime
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:25:06.3409748Z] gc.heap.generation: gen2 LongSum
Value: 3
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:25:06.3409748Z] gc.heap.generation: gen1 LongSum
Value: 0
(2024-09-15T11:24:26.3765112Z, 2024-09-15T11:25:06.3409748Z] gc.heap.generation: gen0 LongSum
Value: 0
Metric Name: dotnet.process.memory.working_set, The number of bytes of physical memory mapped to the process context., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3779803Z, 2024-09-15T11:25:06.3409766Z] LongSumNonMonotonic
Value: 36999168
Metric Name: dotnet.gc.heap.total_allocated, The approximate number of bytes allocated on the managed GC heap since the process has started. The returned value does not include any native allocations., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3779975Z, 2024-09-15T11:25:06.3409774Z] LongSum
Value: 3602608
Metric Name: dotnet.gc.last_collection.memory.committed_size, The amount of committed virtual memory in use by the .NET GC, as observed during the latest garbage collection., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3780104Z, 2024-09-15T11:25:06.3409782Z] LongSumNonMonotonic
Value: 4530176
Metric Name: dotnet.gc.last_collection.heap.size, The managed GC heap size (including fragmentation), as observed during the latest garbage collection., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: gen0 LongSumNonMonotonic
Value: 0
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: gen1 LongSumNonMonotonic
Value: 560
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: gen2 LongSumNonMonotonic
Value: 466256
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: loh LongSumNonMonotonic
Value: 2739800
(2024-09-15T11:24:26.3780292Z, 2024-09-15T11:25:06.3409791Z] gc.heap.generation: poh LongSumNonMonotonic
Value: 9232
Metric Name: dotnet.gc.last_collection.heap.fragmentation.size, The heap fragmentation, as observed during the latest garbage collection., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: gen0 LongSumNonMonotonic
Value: 0
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: gen1 LongSumNonMonotonic
Value: 80
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: gen2 LongSumNonMonotonic
Value: 24
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: loh LongSumNonMonotonic
Value: 608
(2024-09-15T11:24:26.3789008Z, 2024-09-15T11:25:06.3409804Z] gc.heap.generation: poh LongSumNonMonotonic
Value: 0
Metric Name: dotnet.gc.pause.time, The total amount of time paused in GC since the process has started., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3793018Z, 2024-09-15T11:25:06.3409815Z] DoubleSum
Value: 0.007868
Metric Name: dotnet.jit.compiled_il.size, Count of bytes of intermediate language that have been compiled since the process has started., Unit: By, Meter: System.Runtime
(2024-09-15T11:24:26.3793233Z, 2024-09-15T11:25:06.3409821Z] LongSum
Value: 102621
Metric Name: dotnet.jit.compiled_methods, The number of times the JIT compiler (re)compiled methods since the process has started., Unit: {method}, Meter: System.Runtime
(2024-09-15T11:24:26.3793381Z, 2024-09-15T11:25:06.3409826Z] LongSum
Value: 1196
Metric Name: dotnet.jit.compilation.time, The number of times the JIT compiler (re)compiled methods since the process has started., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3793520Z, 2024-09-15T11:25:06.3409832Z] DoubleSum
Value: 1.7024187
Metric Name: dotnet.monitor.lock_contentions, The number of times there was contention when trying to acquire a monitor lock since the process has started., Unit: {contention}, Meter: System.Runtime
(2024-09-15T11:24:26.3793612Z, 2024-09-15T11:25:06.3409859Z] LongSum
Value: 0
Metric Name: dotnet.thread_pool.thread.count, The number of thread pool threads that currently exist., Unit: {thread}, Meter: System.Runtime
(2024-09-15T11:24:26.3793740Z, 2024-09-15T11:25:06.3409867Z] LongSum
Value: 1
Metric Name: dotnet.thread_pool.work_item.count, The number of work items that the thread pool has completed since the process has started., Unit: {work_item}, Meter: System.Runtime
(2024-09-15T11:24:26.3793850Z, 2024-09-15T11:25:06.3409875Z] LongSum
Value: 8
Metric Name: dotnet.thread_pool.queue.length, The number of work items that are currently queued to be processed by the thread pool., Unit: {work_item}, Meter: System.Runtime
(2024-09-15T11:24:26.3793936Z, 2024-09-15T11:25:06.3409883Z] LongSum
Value: 0
Metric Name: dotnet.timer.count, The number of timer instances that are currently active. An active timer is registered to tick at some point in the future and has not yet been canceled., Unit: {timer}, Meter: System.Runtime
(2024-09-15T11:24:26.3794100Z, 2024-09-15T11:25:06.3409890Z] LongSumNonMonotonic
Value: 2
Metric Name: dotnet.assembly.count, The number of .NET assemblies that are currently loaded., Unit: {assembly}, Meter: System.Runtime
(2024-09-15T11:24:26.3794199Z, 2024-09-15T11:25:06.3409895Z] LongSumNonMonotonic
Value: 36
Metric Name: dotnet.process.cpu.count, The number of processors available to the process., Unit: {cpu}, Meter: System.Runtime
(2024-09-15T11:24:26.3794408Z, 2024-09-15T11:25:06.3409908Z] LongSumNonMonotonic
Value: 22
Metric Name: dotnet.process.cpu.time, CPU time used by the process., Unit: s, Meter: System.Runtime
(2024-09-15T11:24:26.3794527Z, 2024-09-15T11:25:06.3409913Z] cpu.mode: user DoubleSum
Value: 0.203125
(2024-09-15T11:24:26.3794527Z, 2024-09-15T11:25:06.3409913Z] cpu.mode: system DoubleSum
Value: 0
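The console exporter is only one possible consumer of these instruments. As a rough sketch, assuming .NET 9 or later, the same System.Runtime meter can also be read with the BCL's MeterListener and no extra packages. The Print helper and the seen set are illustrative names, not part of any API.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Linq;
using System.Threading.Tasks;

// Names of the instruments that reported at least one measurement.
var seen = new HashSet<string>();

using var listener = new MeterListener();

// Enable only the instruments published by the built-in System.Runtime meter.
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "System.Runtime")
    {
        l.EnableMeasurementEvents(instrument);
    }
};

// Register callbacks for long and double measurements.
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    seen.Add(instrument.Name);
    Print(instrument.Name, value, tags);
});
listener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
{
    seen.Add(instrument.Name);
    Print(instrument.Name, value, tags);
});
listener.Start();

for (var i = 0; i < 3; i++)
{
    GC.Collect();
    // Observable instruments only produce values when they are polled.
    listener.RecordObservableInstruments();
    await Task.Delay(TimeSpan.FromSeconds(1));
}

Console.WriteLine($"Observed {seen.Count} distinct instruments");

// Illustrative helper: prints one measurement together with its tags.
static void Print(string name, object value, ReadOnlySpan<KeyValuePair<string, object?>> tags)
{
    var tagText = string.Join(", ", tags.ToArray().Select(t => $"{t.Key}={t.Value}"));
    Console.WriteLine($"{name} [{tagText}] = {value}");
}
```

Each call to RecordObservableInstruments plays roughly the role of one export cycle in the OpenTelemetry sample: it polls the observable instruments and hands their current values to the callbacks.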
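The metrics in detail:
dotnet.process.cpu.time: CPU time used by the process (Counter)
dotnet.process.memory.working_set: the number of bytes of physical memory mapped to the process context (UpDownCounter); corresponds to Environment.WorkingSet
dotnet.gc.collections: the number of garbage collections (Counter)
    attribute dotnet.gc.heap.generation: the highest heap generation covered by the collection, for example gen0/gen1/gen2 (it appears as gc.heap.generation in the output above)
dotnet.gc.heap.total_allocated: the approximate total number of bytes allocated on the GC heap (Counter); corresponds to GC.GetTotalAllocatedBytes()
dotnet.gc.last_collection.memory.committed_size: the committed memory in use by the GC, as observed during the latest garbage collection (UpDownCounter); corresponds to GCMemoryInfo.TotalCommittedBytes
dotnet.gc.last_collection.heap.size: the managed GC heap size (including fragmentation), as observed during the latest garbage collection (UpDownCounter)
    attribute dotnet.gc.heap.generation: the name of the managed heap generation (gen0/gen1/gen2/loh/poh)
dotnet.gc.last_collection.heap.fragmentation.size: the heap fragmentation, as observed during the latest garbage collection (UpDownCounter); corresponds to GCGenerationInfo.FragmentationAfterBytes
dotnet.gc.pause.time: the total time paused in GC (Counter); corresponds to GC.GetTotalPauseDuration()
dotnet.jit.compiled_il.size: the number of bytes of intermediate language that have been compiled (Counter); corresponds to JitInfo.GetCompiledILBytes
dotnet.jit.compiled_methods: the number of times the JIT compiler (re)compiled methods (Counter); corresponds to JitInfo.GetCompiledMethodCount
dotnet.jit.compilation.time: the time the JIT compiler has spent compiling methods (Counter); corresponds to JitInfo.GetCompilationTime
dotnet.thread_pool.thread.count: the number of thread pool threads that currently exist (UpDownCounter); corresponds to ThreadPool.ThreadCount
dotnet.thread_pool.work_item.count: the number of work items the thread pool has completed (Counter); corresponds to ThreadPool.CompletedWorkItemCount
dotnet.thread_pool.queue.length: the number of work items currently queued for the thread pool (UpDownCounter); corresponds to ThreadPool.PendingWorkItemCount
dotnet.monitor.lock_contentions: the number of times there was contention when trying to acquire a Monitor lock (Counter); corresponds to Monitor.LockContentionCount
dotnet.timer.count: the number of Timer instances that are currently active (UpDownCounter); corresponds to Timer.ActiveCount
dotnet.assembly.count: the number of .NET assemblies currently loaded (UpDownCounter); corresponds to the number of assemblies returned by AppDomain.GetAssemblies()
dotnet.exceptions: the number of exceptions thrown in managed code (Counter); corresponds to the number of times the AppDomain.FirstChanceException event has fired
    attribute error.type: the type of the exception, for example System.OperationCanceledException or Contoso.MyException
Taken together, these metrics make it fairly easy to see the current state of CPU, memory, GC, and threads, which helps a lot when judging whether an application is healthy. Besides exporting the metrics in-process with OpenTelemetry, you can also observe them from outside the process with diagnostic tools such as dotnet-counters.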
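Since the list pairs most metrics with a public runtime API, it can help to read those APIs directly once and compare the values with the exported ones. The following is only a sketch of that mapping, not how the runtime itself collects the data. The variable names and the generationNames array are illustrative, and the index-to-name mapping assumes the documented order of GCMemoryInfo.GenerationInfo (gen0, gen1, gen2, LOH, POH).

```csharp
using System;
using System.Runtime;
using System.Threading;

// Each value comes from the API that the list pairs with the metric named in the comment.
var workingSet = Environment.WorkingSet;                          // dotnet.process.memory.working_set
var totalAllocated = GC.GetTotalAllocatedBytes();                 // dotnet.gc.heap.total_allocated
var gcInfo = GC.GetGCMemoryInfo();
var committed = gcInfo.TotalCommittedBytes;                       // dotnet.gc.last_collection.memory.committed_size
var gcPause = GC.GetTotalPauseDuration();                         // dotnet.gc.pause.time
var compiledIlBytes = JitInfo.GetCompiledILBytes();               // dotnet.jit.compiled_il.size
var compiledMethods = JitInfo.GetCompiledMethodCount();           // dotnet.jit.compiled_methods
var jitTime = JitInfo.GetCompilationTime();                       // dotnet.jit.compilation.time
var threadCount = ThreadPool.ThreadCount;                         // dotnet.thread_pool.thread.count
var completedItems = ThreadPool.CompletedWorkItemCount;           // dotnet.thread_pool.work_item.count
var pendingItems = ThreadPool.PendingWorkItemCount;               // dotnet.thread_pool.queue.length
var contentions = Monitor.LockContentionCount;                    // dotnet.monitor.lock_contentions
var activeTimers = System.Threading.Timer.ActiveCount;            // dotnet.timer.count
var assemblies = AppDomain.CurrentDomain.GetAssemblies().Length;  // dotnet.assembly.count

Console.WriteLine($"working set: {workingSet} By, allocated: {totalAllocated} By, committed: {committed} By");
Console.WriteLine($"GC pause: {gcPause.TotalSeconds} s, JIT time: {jitTime.TotalSeconds} s");
Console.WriteLine($"JIT: {compiledMethods} methods, {compiledIlBytes} IL bytes");
Console.WriteLine($"thread pool: {threadCount} threads, {completedItems} completed, {pendingItems} queued");
Console.WriteLine($"lock contentions: {contentions}, timers: {activeTimers}, assemblies: {assemblies}");

// Per-generation fragmentation, as reported for the latest garbage collection.
var generationNames = new[] { "gen0", "gen1", "gen2", "loh", "poh" };
var generations = gcInfo.GenerationInfo;
for (var i = 0; i < generations.Length && i < generationNames.Length; i++)
{
    Console.WriteLine($"{generationNames[i]} fragmentation: {generations[i].FragmentationAfterBytes} By");
}
```

The numbers will not match an export exactly, because they are sampled at a different moment, but they should be of the same order of magnitude.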
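Watch the CPU data to spot excessive CPU usage
Watch the memory and GC data for signs of GC pressure, memory leaks, or heap fragmentation
Watch the thread pool queue length and thread count for signs of thread pool starvation
Watch lock contention for signs of deadlocks or poor lock usage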
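To make the thread pool point a little more concrete, here is a rough sketch of how the queue length and thread count could be sampled and checked together. The LooksStarved helper, the sampling window, and the rule it applies (the queue never drains while the pool keeps adding threads) are all assumptions made for illustration. Real starvation diagnosis usually needs more context than two counters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical heuristic: flag the window when the queue was never empty
// and the pool ended the window with more threads than it started with.
static bool LooksStarved(IReadOnlyList<(long QueueLength, int ThreadCount)> window)
{
    if (window.Count < 2)
    {
        return false;
    }

    var queueNeverEmpty = window.All(s => s.QueueLength > 0);
    var threadsGrowing = window[window.Count - 1].ThreadCount > window[0].ThreadCount;
    return queueNeverEmpty && threadsGrowing;
}

// Sample the same values that back dotnet.thread_pool.queue.length
// and dotnet.thread_pool.thread.count a few times.
var samples = new List<(long QueueLength, int ThreadCount)>();
for (var i = 0; i < 5; i++)
{
    samples.Add((ThreadPool.PendingWorkItemCount, ThreadPool.ThreadCount));
    await Task.Delay(TimeSpan.FromMilliseconds(200));
}

Console.WriteLine($"Possible thread pool starvation: {LooksStarved(samples)}");
```

In a real service the same check would more likely live in an alerting rule on the exported metrics than in application code.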
In short, runtime metrics make our applications more observable and make it much easier to get at the current state of an application.