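In this installment of Tech Decode, we look at
how to analyze and locate memory problems in programming
From a language-design point of view, memory management falls into two broad categories: manual memory management and garbage collection (GC). C and C++ use manual memory management, while Java uses garbage collection. This article focuses on manual memory management.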
GC
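GC automates memory management. Its drawback is the unpredictable stall it introduces while collecting, which makes it unsuitable for scenarios with strict real-time requirements. Modern GC implementations keep optimizing to reduce the impact of "stop-the-world" pauses.
A language with GC has not said goodbye to memory leaks for good (although a leak here means something slightly different from a leak in a manually managed language). When a long-lived object keeps holding a short-lived object, the short-lived object is no longer used but can never be collected, and that is a memory leak. Leaks of this kind are widespread in Android application development. Anonymous inner classes deserve particular attention, because they implicitly hold a reference to the enclosing instance. LeakCanary (https://square.github.io/leakcanary/) can be used to check for such leaks.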
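For manual memory management, reference counting is a common way to avoid memory leaks. In fact, reference counting addresses two major problems:
Reference counting has one weakness: it cannot handle reference cycles. When a cycle exists, the counts always stay above 0 and the objects are never released.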
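Both C and C++ can use reference counting, but only the combination of reference counting and RAII (Resource Acquisition Is Initialization) brings manual memory management close to GC in convenience. In C, functions such as hold and release must be called by hand to adjust the count and free the memory. If any code path, especially an error-handling path, misses a single release, memory leaks. RAII adjusts the count automatically in constructors and destructors, so the count stays correct even when an exception is thrown.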
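RAII can also be used on its own, and for resources other than memory, such as file descriptors. One drawback of GC is that it cannot release non-memory resources promptly and automatically. A Java finalizer, for example, is not equivalent to a C++ destructor: it can serve as a last-resort fallback, but it should not be the first choice for closing a file descriptor.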
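Rust also builds memory safety on RAII, through its ownership model, with reference counting available via Rc and Arc. The language design makes simple reference cycles fail at compile time, which lowers the likelihood of cycles but cannot rule them out completely.
Common memory problems include:
Some common misconceptions: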
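#include <iostream>
#include <new>

void test() {
    // A failed plain new throws std::bad_alloc; it does not return nullptr.
    try {
        int *p = new int[1ULL << 50U];
        std::cout << p << '\n';
        delete[] p;
    } catch (const std::bad_alloc &e) {
        std::cout << e.what() << '\n';
    }

    // Only the nothrow form reports failure through a null pointer.
    int *p = new(std::nothrow) int[1ULL << 50U];
    if (p == nullptr) {
        std::cout << "Allocation returned nullptr\n";
    }
    delete[] p;
}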
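In addition, free(NULL) and delete nullptr are both safe, so whether to check for a non-null pointer before delete is a matter of code style.
Memory errors lead to undefined behavior, which can make program logic go wrong; the most visible symptom is a crash. Analyzing, locating, and fixing memory bugs from crashes is a remedy after the fact, but if the bug is fixed promptly during a staged rollout, it is still not too late.
NDK development is an important part of Android application development, especially for apps with audio and video features. When Java or Kotlin code crashes on an exception, there is usually a clear backtrace, and the crash scene is relatively easy to sort out. Faced with a native crash, and the pile of register values uploaded by the crash reporting system, some developers may not know where to start. The following uses Android as an example to outline the tools and methods for analyzing native crashes.
When an Android app crashes natively, in a minority of cases an unrecoverable error occurred and the program aborted on purpose (SIGABRT); in most cases the cause is a memory problem.
When abort() is called on purpose, the reason is usually given. For example, Android's LOG_FATAL() prints the log message first and then calls abort(). Apps generally do not call LOG_FATAL(), but occasionally the Android system does so in abnormal situations. If the crash reporting system has complete logs from the crash scene, finding the cause from the logs is fairly easy.
Crashes caused by different memory problems look different, for example:
A crash reporting system usually reports the following information:
A log may contain both a backtrace and register values, for example:
Some key information in the figure: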
void
g_system_thread_set_name (const gchar *name)
{
#if defined(HAVE_PTHREAD_SETNAME_NP_WITHOUT_TID)
  pthread_setname_np (name); /* on OS X and iOS */
#elif defined(HAVE_PTHREAD_SETNAME_NP_WITH_TID)
  pthread_setname_np (pthread_self (), name); /* on Linux and Solaris */
#elif defined(HAVE_PTHREAD_SETNAME_NP_WITH_TID_AND_ARG)
  pthread_setname_np (pthread_self (), "%s", (gchar *) name); /* on NetBSD */
#elif defined(HAVE_PTHREAD_SET_NAME_NP)
  pthread_set_name_np (pthread_self (), name); /* on FreeBSD, DragonFlyBSD, OpenBSD */
#endif
}
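Different information is analyzed with different tools:
If the dynamic library with symbols was not saved at build time and is regenerated after the crash, the new library does not necessarily match the released version. Saving it at build time is especially important for restoring the call stack accurately. Analyzing and debugging without symbols is possible, but considerably more complex.
Also, pay special attention to whether unwind tables have been turned off. For example, a WebRTC release build turns off unwind tables, which leaves the backtrace incomplete at crash time.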
build/config/compiler/BUILD.gn:
if (!is_nacl) {
  if (exclude_unwind_tables) {
    cflags += [
      "-fno-unwind-tables",
      "-fno-asynchronous-unwind-tables",
    ]
    defines += [ "NO_UNWIND_TABLES" ]
  } else {
    cflags += [ "-funwind-tables" ]
  }
}
The reason WebRTC turns off unwind tables is explained in compiler.gni:
# Exclude unwind tables for official builds as unwinding can be done from stack
# dumps produced by Crashpad at a later time "offline" in the crash server.
# For unofficial (e.g. development) builds and non-Chrome branded (e.g. Cronet
# which doesn't use Crashpad, crbug.com/479283) builds it's useful to be able
# to unwind at runtime.
exclude_unwind_tables =
    is_official_build || (is_chromecast && !is_cast_desktop_build &&
                          !is_debug && !cast_is_debug && !is_fuchsia)
Some notes on symbols:
set(CMAKE_C_VISIBILITY_PRESET hidden)
set(CMAKE_CXX_VISIBILITY_PRESET hidden)
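Restoring the call stack
The first step is usually to restore the call stack with addr2line. The NDK provides a simplified tool, ndk-stack, which takes the log as input and outputs the restored call stack. For example: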
adb logcat > /tmp/foo.txt
$NDK/ndk-stack -sym $PROJECT_PATH/obj/local/armeabi-v7a -dump /tmp/foo.txt
When using addr2line, the commonly used options are: (https://linux.die.net/man/1/addr2line)
$ c++filt -n _Z1fv
f()
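Once the call stack is restored and combined with the logs, the cause of some crashes is immediately clear, such as a null object pointer. Others remain unexplained, or look like "impossible crashes", and need further analysis.
After addr2line identifies the source line, that line may contain several dereferences or several conditions, so it is not certain which operation triggered the crash (on reflection, this is a good reason to avoid overly complex lines of code). Disassembly reveals the exact instruction that triggered the crash.
objdump can be used for disassembly. It has many functions; for disassembly, the commonly used options are: (https://linux.die.net/man/1/objdump)
For example: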
aarch64-linux-android-objdump -D -C libvlc.so > dump
In the file produced by objdump, search for the pc address. For example, for the earlier crash at 0x91c30:
0000000000091c04 <cricket::Connection::ToString() const>:
   91c04:  a9bb7bfd  stp  x29, x30, [sp,#-80]!
   91c08:  f9000bfc  str  x28, [sp,#16]
   91c0c:  a9025ff8  stp  x24, x23, [sp,#32]
   91c10:  a90357f6  stp  x22, x21, [sp,#48]
   91c14:  a9044ff4  stp  x20, x19, [sp,#64]
   91c18:  910003fd  mov  x29, sp
   91c1c:  d104c3ff  sub  sp, sp, #0x130
   91c20:  f9400009  ldr  x9, [x0]
   91c24:  aa0003f4  mov  x20, x0
   91c28:  aa0803f3  mov  x19, x8
   91c2c:  f9400929  ldr  x9, [x9,#16]
   91c30:  d63f0120  blr  x9
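As shown, the crash happens at blr x9. The register values at crash time, the fault address, and so on can all be matched up one by one.
After disassembly, some basic knowledge of assembly is needed to read and analyze the code:
Because this assembly is not complicated, it is already possible to conclude that the Connection object is a dangling pointer: a wrong virtual function address was read from the vtable, and the crash occurred when that virtual function was called.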
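Sometimes the code logic is complex: the crashing instruction can be located, but it is unclear how it maps to the C++ code. A debugger helps to analyze the code and verify hypotheses.
Debugging normally only needs source-level stepping, but in crash analysis instruction-level stepping is more convenient. Combined with printing register values, it quickly reveals how assembly instructions map to C++. For example, debugging can confirm which virtual function's address is held in x9.
Instruction-level stepping:
Restoring the call stack, disassembling, and verifying with a debugger make the crash scene clear and reveal the direct cause of the crash. The root cause, however, may still be hidden. For example, an abnormal virtual function address loaded from the vtable implies that the Connection object is abnormal, but the problem does not necessarily lie in Connection. Broaden the investigation rather than staring at the crashing line and missing the root cause. Consider the following directions:
A debugger is also very useful for tracing a problem back to its cause. Both gdb and lldb support watchpoints. A watchpoint can be used to analyze out-of-bounds reads and writes: the program pauses when certain addresses are read or written. For basic usage, see the GDB to LLDB command map: (https://lldb.llvm.org/use/map.html)
For how watchpoints are implemented, see: Hardware Breakpoint (or watchpoint) usage in Linux Kernel
For debugging NDK code in Android Studio, see: https://developer.android.com/studio/debug
Locating and fixing memory problems from crash analysis is a remedy after the fact; during development we should take precautions in advance. Some tools make it easy to check for memory problems, and combined with continuous integration they can effectively reduce crashes and improve software quality.
Some basic techniques can be used to verify whether memory is leaking.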
TA ("Tree Allocator") is a wrapper around malloc() and related functions, adding features like automatically freeing sub-trees of memory allocations if a parent allocation is freed.
Generally, the idea is that every TA allocation can have a parent (indicated by the ta_parent argument in allocation function calls). If a parent is freed, its child allocations are automatically freed as well. It is also allowed to free a child before the parent, or to move a child to another parent with ta_set_parent().
It also provides a bunch of convenience macros and debugging facilities.
The TA functions are documented in the implementation files (ta.c, ta_utils.c).
TA is intended to be useable as library independent from mpv. It doesn't depend on anything mpv specific.
Valgrind
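Valgrind is a commonly used memory checking tool on Linux. An application is started under Valgrind, which acts like a virtual machine and tracks the application's memory allocations, frees, and other operations. The Valgrind tool suite contains several tools, the most commonly used being memcheck. memcheck can detect the following problems:
The advantage of Valgrind memcheck is that the application does not need to be recompiled. The drawback is that it runs slowly, 20 to 30 times slower than normal, and uses more memory. Valgrind is therefore unsuitable for applications with hard real-time requirements, such as media players.
In addition, massif is a heap profiler that quantifies the memory usage of each module, so that memory optimization can be targeted.
For detailed documentation, see: http://valgrind.org/docs/manual/index.html
Early versions of Android supported Valgrind, and the AOSP source includes a Valgrind build. After Android 8.0, however, Valgrind basically cannot run. Running Valgrind also requires root, so it is hard to find an Android device that can run it. The basic workflow for using Valgrind on Android is outlined below.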
export TOOLCHAINS_PATH=$ANDROID_NDK/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin
export AR=$TOOLCHAINS_PATH/arm-linux-androideabi-ar
export LD=$TOOLCHAINS_PATH/arm-linux-androideabi-ld
export CC=$TOOLCHAINS_PATH/arm-linux-androideabi-gcc
export RANLIB=$TOOLCHAINS_PATH/arm-linux-androideabi-ranlib
export CPPFLAGS="--sysroot=$ANDROID_NDK/platforms/android-9/arch-arm"
export CFLAGS="--sysroot=$ANDROID_NDK/platforms/android-9/arch-arm"
./configure --prefix=/data/local/tmp/Inst \
    --host=armv7-unknown-linux --target=armv7-unknown-linux \
    --with-tmpdir=/sdcard
make -j8 && make -j8 install DESTDIR="$PWD/Inst"
adb push Inst/data/local/tmp/Inst/ /data/local/tmp/
During memory checking, Valgrind can report the offending source line and call stack, provided the application contains debug symbols.
#!/system/bin/sh
PACKAGE="com.example.helloworld"

# Callgrind tool
#VGPARAMS='-v --error-limit=no --trace-children=yes --log-file=/sdcard/valgrind.log.%p --tool=callgrind --callgrind-out-file=/sdcard/callgrind.out.%p'

# Memcheck tool
#VGPARAMS='-v --error-limit=no --trace-children=yes --tool=memcheck --leak-check=full --show-reachable=yes'
VGPARAMS='-v --error-limit=no --trace-children=yes --track-origins=yes --log-file=/sdcard/valgrind/log.%p --tool=memcheck --leak-check=full --show-reachable=yes'

export TMPDIR=/data/data/$PACKAGE
exec /data/local/tmp/Inst/bin/valgrind $VGPARAMS $*
Run on the device:
PACKAGE="com.example.helloworld"
setprop "wrap.$PACKAGE" 'logwrapper /data/local/tmp/start_valgrind.sh'
getprop "wrap.$PACKAGE" # make sure setprop success
am force-stop $PACKAGE
am start -a android.intent.action.MAIN -n $PACKAGE/.MainActivity
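Once setprop takes effect, the app can be started either with am start or through the UI. When the app starts, start_valgrind.sh runs first, start_valgrind.sh runs Valgrind, and Valgrind then starts the application under test. Wait patiently for the application to start, then carry out the usual test operations.
Output
While the program is running, Valgrind writes some of its findings (such as uninitialized values and out-of-bounds accesses) to the /sdcard/valgrind/ directory. The memory leak summary, however, is produced only after the program has exited completely.
On Android, kill -TERM can be used to make the program exit. Avoid kill -KILL, which terminates the program immediately so that Valgrind cannot output its results.
Checking a demo program on Linux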
#include <iostream>

int main(int argc, char *argv[])
{
    char *p = new char[2];

    p[2] = 123;

    if (p[0] > 'a')
        std::cout << __LINE__ << std::endl;
    else
        std::cout << __LINE__ << std::endl;
    return 0;
}
Valgrind prints the following:
==1157== Memcheck, a memory error detector
==1157== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1157== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==1157== Command: ./foo
==1157==
==1157== Invalid write of size 1
==1157==    at 0x40088B: main (foo.cpp:7)
==1157==  Address 0x5ab6c82 is 0 bytes after a block of size 2 alloc'd
==1157==    at 0x4C2E80F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1157==    by 0x40087E: main (foo.cpp:5)
==1157==
==1157== Conditional jump or move depends on uninitialised value(s)
==1157==    at 0x400897: main (foo.cpp:9)
==1157==  Uninitialised value was created by a heap allocation
==1157==    at 0x4C2E80F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1157==    by 0x40087E: main (foo.cpp:5)
==1157==
12
==1157==
==1157== HEAP SUMMARY:
==1157==     in use at exit: 72,706 bytes in 2 blocks
==1157==   total heap usage: 3 allocs, 1 frees, 73,730 bytes allocated
==1157==
==1157== 2 bytes in 1 blocks are definitely lost in loss record 1 of 2
==1157==    at 0x4C2E80F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==1157==    by 0x40087E: main (foo.cpp:5)
==1157==
==1157== LEAK SUMMARY:
==1157==    definitely lost: 2 bytes in 1 blocks
==1157==    indirectly lost: 0 bytes in 0 blocks
==1157==      possibly lost: 0 bytes in 0 blocks
==1157==    still reachable: 72,704 bytes in 1 blocks
==1157==         suppressed: 0 bytes in 0 blocks
==1157== Reachable blocks (those to which a pointer was found) are not shown.
==1157== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1157==
==1157== For counts of detected and suppressed errors, rerun with: -v
==1157== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
As shown, all three bugs are located: the out-of-bounds access to p[2], the uninitialized p[0], and the memory leak.
Sanitizers include the following tools
https://github.com/google/sanitizers
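For the implementation principle, see: "AddressSanitizer: A Fast Address Sanity Checker"
Platform support:
Features:
Note: LeakSanitizer, which checks for memory leaks, currently supports only Linux and macOS. On macOS a separately installed LLVM toolchain is required, because the one bundled with Xcode does not support it. LeakSanitizer is enabled by default on Linux; on macOS it is enabled through an environment variable: ASAN_OPTIONS=detect_leaks=1
Compared with Valgrind, AddressSanitizer has a much smaller impact on execution speed: a program runs at roughly half its normal speed.
AddressSanitizer is supported starting from LLVM 3.1 and GCC 4.8.
Compiler options: add -fsanitize=address to cflags, cxxflags, and the link flags. To make the output more readable, also adjust the compiler optimization level, enable debug symbols, skip strip, and so on.
Enabling ASan in Xcode:
Basic steps for using ASan on Android: https://developer.android.com/ndk/guides/asan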
target_compile_options(${TARGET} PUBLIC -fsanitize=address -fno-omit-frame-pointer)
set_target_properties(${TARGET} PROPERTIES LINK_FLAGS -fsanitize=address)
#!/system/bin/sh
HERE="$(cd "$(dirname "$0")" && pwd)"
export ASAN_OPTIONS=log_to_syslog=false,allow_user_segv_handler=1
ASAN_LIB=$(ls $HERE/libclang_rt.asan-*-android.so)
if [ -f "$HERE/libc++_shared.so" ]; then
    # Workaround for https://github.com/android-ndk/ndk/issues/988.
    export LD_PRELOAD="$ASAN_LIB $HERE/libc++_shared.so"
else
    export LD_PRELOAD="$ASAN_LIB"
fi
"$@"
The overall layout is as follows:
<project root>
└── app
    └── src
        └── main
            ├── jniLibs
            │   ├── arm64-v8a
            │   │   └── libclang_rt.asan-aarch64-android.so
            │   ├── armeabi-v7a
            │   │   └── libclang_rt.asan-arm-android.so
            │   ├── x86
            │   │   └── libclang_rt.asan-i686-android.so
            │   └── x86_64
            │       └── libclang_rt.asan-x86_64-android.so
            └── resources
                └── lib
                    ├── arm64-v8a
                    │   └── wrap.sh
                    ├── armeabi-v7a
                    │   └── wrap.sh
                    ├── x86
                    │   └── wrap.sh
                    └── x86_64
                        └── wrap.sh
Sample code:
#include <stdlib.h>
int main() {
  char *x = (char*)malloc(10 * sizeof(char*));
  free(x);
  return x[5];
}
Compile:
/tmp$ gcc -fsanitize=address foo.c -o foo -g -O0
Run:
/tmp$ ./foo
=================================================================
==6145==ERROR: AddressSanitizer: heap-use-after-free on address 0x60700000dfb5 at pc 0x0000004007d4 bp 0x7ffcf79be420 sp 0x7ffcf79be410
READ of size 1 at 0x60700000dfb5 thread T0
    #0 0x4007d3 in main /tmp/foo.c:5
    #1 0x7f84f366482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #2 0x4006a8 in _start (/tmp/foo+0x4006a8)
0x60700000dfb5 is located 5 bytes inside of 80-byte region [0x60700000dfb0,0x60700000e000)
freed by thread T0 here:
    #0 0x7f84f3aa62ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca)
    #1 0x400797 in main /tmp/foo.c:4
    #2 0x7f84f366482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
previously allocated by thread T0 here:
    #0 0x7f84f3aa6602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x400787 in main /tmp/foo.c:3
    #2 0x7f84f366482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
SUMMARY: AddressSanitizer: heap-use-after-free /tmp/foo.c:5 main
...
ThreadSanitizer (TSan) checks for data races between threads. It is supported starting from clang 3.2 and GCC 4.8.
Compiler options: add -fsanitize=thread to cflags, cxxflags, and the link flags.
Sample code:
#include <pthread.h>
#include <stdio.h>

int Global;

void *Thread1(void *x) {
  Global++;
  return NULL;
}

void *Thread2(void *x) {
  Global--;
  return NULL;
}

int main() {
  pthread_t t[2];
  pthread_create(&t[0], NULL, Thread1, NULL);
  pthread_create(&t[1], NULL, Thread2, NULL);
  pthread_join(t[0], NULL);
  pthread_join(t[1], NULL);
}
Compile
/tmp$ gcc -fsanitize=thread -o foo foo.c -g -O0
Run
/tmp$ ./foo
==================
WARNING: ThreadSanitizer: data race (pid=6761)
  Read of size 4 at 0x00000060107c by thread T2:
    #0 Thread2 /tmp/foo.c:12 (foo+0x000000400998)
    #1 <null> <null> (libtsan.so.0+0x0000000230d9)
  Previous write of size 4 at 0x00000060107c by thread T1:
    #0 Thread1 /tmp/foo.c:7 (foo+0x00000040095b)
    #1 <null> <null> (libtsan.so.0+0x0000000230d9)
  Location is global 'Global' of size 4 at 0x00000060107c (foo+0x00000060107c)
  Thread T2 (tid=6764, running) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x000000027577)
    #1 main /tmp/foo.c:19 (foo+0x000000400a23)
  Thread T1 (tid=6763, finished) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x000000027577)
    #1 main /tmp/foo.c:18 (foo+0x000000400a04)
SUMMARY: ThreadSanitizer: data race /tmp/foo.c:12 Thread2
==================
ThreadSanitizer: reported 1 warnings
MemorySanitizer (MSan) checks for reads of uninitialized memory. It is supported starting from clang 3.3.
Compiler option: -fsanitize=memory.
Sample code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
  int n;
  if (n > 0)
    printf("%d\n", n);
  return 0;
}
Compile
/tmp$ clang -fsanitize=memory -fPIE -pie -fno-omit-frame-pointer -g bar.c -o bar
Run
/tmp$ ./bar
==201433==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x494fc7 in main /tmp/bar.c:7:7
    #1 0x7f9e2ba4b0b2 in __libc_start_main /build/glibc-YbNSs7/glibc-2.31/csu/../csu/libc-start.c:308:16
    #2 0x41c26d in _start (/tmp/bar+0x41c26d)
SUMMARY: MemorySanitizer: use-of-uninitialized-value /tmp/bar.c:7:7 in main
Exiting
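The multimedia audio and video field involves parsing untrusted input, handling I/O errors, hand-written assembly optimization, and more, so all kinds of memory errors occur. To reduce such errors, FFmpeg has done the following: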
$ ./configure --help |less
...
  --toolchain=NAME         set tool defaults according to NAME
                           (gcc-asan, clang-asan, gcc-msan, clang-msan,
                           gcc-tsan, clang-tsan, gcc-usan, clang-usan,
                           valgrind-massif, valgrind-memcheck,
                           msvc, icl, gcov, llvm-cov, hardened)
...
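As shown, a simple configure option enables several gcc/clang sanitizers, Valgrind memcheck, and other tools. Combined with out-of-tree builds and FATE (FFmpeg Automated Testing Environment), every code change can be built in multiple configurations, and its unit and system tests can be checked with multiple tools.
Consider the following code: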
 1 #include <iostream>
 2 #include <memory>
 3
 4 class Foo {
 5 public:
 6     ~Foo() {
 7         std::cout << __PRETTY_FUNCTION__ << '\n';
 8     }
 9
10     char buf[256];
11 };
12
13 void test() {
14     auto a = std::make_shared<Foo>();
15     std::weak_ptr<Foo> b(a);
16     a = nullptr;
17     b.lock();
18 }
19
20 int main() {
21     test();
22     return 0;
23 }
First, look at the size of the memory allocation:
(lldb) bt
* thread #1, name = 'foo', stop reason = breakpoint 1.2
  * frame #0: 0x00007ffff7af7260 libc.so.6`__GI___libc_malloc(bytes=272) at malloc.c:3023:1
    frame #1: 0x00007ffff7e60c29 libstdc++.so.6`operator new(unsigned long) + 25
    frame #2: 0x0000000000401d44 foo`__gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2> >::allocate(this=0x00007fffffffdd50, __n=1, (null)=0x0000000000000000) at new_allocator.h:114:27
    frame #3: 0x0000000000401cb4 foo`std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2> > >::allocate(__a=0x00007fffffffdd50, __n=1)2> >&, unsigned long) at alloc_traits.h:444:20
    frame #4: 0x0000000000401aba foo`std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2> > > std::__allocate_guarded<std::allocator<std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__a=0x00007fffffffdd50)2> > >(std::allocator<std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2> >&) at allocated_ptr.h:97:21
    frame #5: 0x0000000000401950 foo`std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<Foo, std::allocator<Foo> >(this=0x00007fffffffdea8, __p=0x00007fffffffdea0, __a=_Sp_alloc_shared_tag<std::allocator<Foo> > @ 0x00007fffffffdd68) at shared_ptr_base.h:677:19
    frame #6: 0x00000000004018f0 foo`std::__shared_ptr<Foo, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<Foo> >(this=0x00007fffffffdea0, __tag=_Sp_alloc_shared_tag<std::allocator<Foo> > @ 0x00007fffffffdd98) at shared_ptr_base.h:1344:14
    frame #7: 0x00000000004018a8 foo`std::shared_ptr<Foo>::shared_ptr<std::allocator<Foo> >(this=0x00007fffffffdea0, __tag=_Sp_alloc_shared_tag<std::allocator<Foo> > @ 0x00007fffffffddc8) at shared_ptr.h:359:4
    frame #8: 0x000000000040182b foo`std::shared_ptr<Foo> std::allocate_shared<Foo, std::allocator<Foo> >(__a=0x00007fffffffde40) at shared_ptr.h:701:14
    frame #9: 0x0000000000401494 foo`std::shared_ptr<Foo> std::make_shared<Foo>() at shared_ptr.h:717:14
    frame #10: 0x0000000000401271 foo`test() at foo.cpp:14:12
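As shown, std::make_shared<Foo>() requests 272 bytes, 16 bytes more than sizeof(Foo). The extra bytes hold the reference-counting control block, which make_shared places in the same allocation as the object.
Next, look at when the destructor runs: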
(lldb) c
Process 206788 resuming
Process 206788 stopped
* thread #1, name = 'foo', stop reason = breakpoint 3.1
    frame #0: 0x000000000040218c foo`Foo::~Foo(this=0x0000000000418ec0) at foo.cpp:7:17
   4    class Foo {
   5    public:
   6        ~Foo() {
-> 7            std::cout << __PRETTY_FUNCTION__ << '\n';
   8        }
   9
   10       char buf[256];
(lldb) bt
* thread #1, name = 'foo', stop reason = breakpoint 3.1
  * frame #0: 0x000000000040218c foo`Foo::~Foo(this=0x0000000000418ec0) at foo.cpp:7:17
    frame #1: 0x0000000000402179 foo`void __gnu_cxx::new_allocator<Foo>::destroy<Foo>(this=0x0000000000418ec0, __p=0x0000000000418ec0) at new_allocator.h:153:10
    frame #2: 0x0000000000402110 foo`void std::allocator_traits<std::allocator<Foo> >::destroy<Foo>(__a=0x0000000000418ec0, __p=0x0000000000418ec0) at alloc_traits.h:497:8
    frame #3: 0x0000000000401edf foo`std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2>::_M_dispose(this=0x0000000000418eb0) at shared_ptr_base.h:557:2
    frame #4: 0x00000000004016dc foo`std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release(this=0x0000000000418eb0) at shared_ptr_base.h:155:6
    frame #5: 0x000000000040168a foo`std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count(this=0x00007fffffffddf8) at shared_ptr_base.h:730:11
    frame #6: 0x000000000040252e foo`std::__shared_ptr<Foo, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr(this=0x00007fffffffddf0) at shared_ptr_base.h:1169:31
    frame #7: 0x0000000000402423 foo`std::__shared_ptr<Foo, (__gnu_cxx::_Lock_policy)2>::operator=(this=0x00007fffffffdea0, __r=0x00007fffffffde80)2>&&) at shared_ptr_base.h:1265:2
    frame #8: 0x0000000000401554 foo`std::shared_ptr<Foo>::operator=(this=0x00007fffffffdea0, __r=nullptr) at shared_ptr.h:335:27
    frame #9: 0x0000000000401298 foo`test() at foo.cpp:16:5
The destructor runs at a = nullptr.
Finally, look at when free is called:
* thread #1, name = 'foo', stop reason = breakpoint 2.2
    frame #0: 0x00007ffff7af7850 libc.so.6`__GI___libc_free(mem=0x0000000000418eb0) at malloc.c:3087:1
(lldb) bt
* thread #1, name = 'foo', stop reason = breakpoint 2.2
  * frame #0: 0x00007ffff7af7850 libc.so.6`__GI___libc_free(mem=0x0000000000418eb0) at malloc.c:3087:1
    frame #1: 0x00000000004022f0 foo`__gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2> >::deallocate(this=0x00007fffffffddb0, __p=0x0000000000418eb0, (null)=1)2>*, unsigned long) at new_allocator.h:128:2
    frame #2: 0x00000000004022c8 foo`std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2> > >::deallocate(__a=0x00007fffffffddb0, __p=0x0000000000418eb0, __n=1)2> >&, std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2>*, unsigned long) at alloc_traits.h:470:13
    frame #3: 0x0000000000401c44 foo`std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2> > >::~__allocated_ptr(this=0x00007fffffffdda0) at allocated_ptr.h:73:4
    frame #4: 0x0000000000401f45 foo`std::_Sp_counted_ptr_inplace<Foo, std::allocator<Foo>, (__gnu_cxx::_Lock_policy)2>::_M_destroy(this=0x0000000000418eb0) at shared_ptr_base.h:567:7
    frame #5: 0x00000000004017e9 foo`std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_weak_release(this=0x0000000000418eb0) at shared_ptr_base.h:194:6
    frame #6: 0x000000000040179a foo`std::__weak_count<(__gnu_cxx::_Lock_policy)2>::~__weak_count(this=0x00007fffffffde98) at shared_ptr_base.h:823:11
    frame #7: 0x000000000040175e foo`std::__weak_ptr<Foo, (__gnu_cxx::_Lock_policy)2>::~__weak_ptr(this=0x00007fffffffde90) at shared_ptr_base.h:1596:29
    frame #8: 0x00000000004015e8 foo`std::weak_ptr<Foo>::~weak_ptr(this=0x00007fffffffde90) at shared_ptr_base.h:351:11
    frame #9: 0x00000000004012c4 foo`test() at foo.cpp:18:1
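As shown, free() is not called until the weak_ptr goes out of scope. Because the object and the control block share one allocation, the storage cannot be returned while any weak_ptr keeps the control block alive, so combining make_shared with weak_ptr needs care.
References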
1. Scott Meyers, "Effective Modern C++", 2014.
2. Bjarne Stroustrup, "The C++ Programming Language, 4th Edition", 2013.
3. "Arm Architecture Reference Manual Armv8, for Armv8-A architecture profile", https://developer.arm.com/documentation/ddi0487/latest/.
4. Prasad Krishnan, "Hardware Breakpoint (or watchpoint) usage in Linux Kernel".
5. Konstantin Serebryany, Derek Bruening, etc., "AddressSanitizer: A Fast Address Sanity Checker".