KSCrash Source Code Analysis


Author: 用户2297838

Last modified: 2018-12-10 11:25:36


0x01 Installation

1.1 Getting Started

KSCrashInstallationStandard* installation = [KSCrashInstallationStandard sharedInstance];
installation.url = [NSURL URLWithString:@"http://put.your.url.here"];
[installation install];

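The code above installs KSCrash. [KSCrashInstallationStandard init] assigns the installation's own properties. [KSCrashInstallationStandard install] then calls [KSCrash init], which sets the crash monitors to be initialized, and then [KSCrash install], which starts whichever monitors were selected in the previous step.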

- (void) install
{
    KSCrash* handler = [KSCrash sharedInstance];
    @synchronized(handler)
    {
        g_crashHandlerData = self.crashHandlerData;
        handler.onCrash = crashCallback;
        [handler install];
    }
}

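Before the monitors are installed, the crash callback (onCrash) is set. Eventually void kscm_setActiveMonitors(KSCrashMonitorType monitorTypes) is called to start the monitors, where monitorTypes is the g_monitoring value configured at the start. KSCrashMonitor defines an array of monitors, g_monitors. The monitorType of each entry in that array is combined with the configured g_monitoring mask using a bitwise AND, and the result decides whether the monitor for that monitorType is installed. The sections below walk through the installation of each monitor.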
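The selection step can be illustrated with a small, self-contained sketch. The type names, flag values, and the function set_active_monitors below are assumptions made for illustration; they are not KSCrash's actual definitions. The sketch only shows the idea of testing each monitor's flag against the requested mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical monitor flags; the names and values are illustrative only. */
typedef enum {
    MonitorTypeMach   = 1 << 0,
    MonitorTypeSignal = 1 << 1,
    MonitorTypeCpp    = 1 << 2,
} MonitorType;

typedef struct {
    MonitorType type;
    bool enabled;
} Monitor;

/* Enable exactly those monitors whose flag is set in `requested`.
 * Returns the number of monitors that ended up enabled. */
static int set_active_monitors(Monitor *monitors, size_t count, unsigned requested)
{
    assert(monitors != NULL);
    int enabled = 0;
    for (size_t i = 0; i < count; i++) {
        monitors[i].enabled = (monitors[i].type & requested) != 0;
        if (monitors[i].enabled) {
            enabled++;
        }
    }
    return enabled;
}
```

Calling set_active_monitors with MonitorTypeMach | MonitorTypeCpp on a three-entry array would enable the first and third monitors and leave the signal monitor disabled.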

1.2 Mach kernel exceptions

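Mach kernel exceptions are represented by the enum value KSCrashMonitorTypeMachException and implemented in KSCrashMonitor_MachException. If that value is included in the global g_monitoring mask, the monitor is enabled through the common API static void setEnabled(bool isEnabled), which eventually calls installExceptionHandler.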

static bool installExceptionHandler()
{
    ...

    // Back up the existing exception ports.
    KSLOG_DEBUG("Backing up original exception ports.");
    kr = task_get_exception_ports(thisTask,
                                  mask,
                                  g_previousExceptionPorts.masks,
                                  &g_previousExceptionPorts.count,
                                  g_previousExceptionPorts.ports,
                                  g_previousExceptionPorts.behaviors,
                                  g_previousExceptionPorts.flavors);
    
    if(kr != KERN_SUCCESS)
    {
        KSLOG_ERROR("task_get_exception_ports: %s", mach_error_string(kr));
        goto failed;
    }

    if(g_exceptionPort == MACH_PORT_NULL)
    {
        // Allocate a new port with a receive right.
        KSLOG_DEBUG("Allocating new port with receive rights.");
        kr = mach_port_allocate(thisTask,
                                MACH_PORT_RIGHT_RECEIVE,
                                &g_exceptionPort);
        if(kr != KERN_SUCCESS)
        {
            KSLOG_ERROR("mach_port_allocate: %s", mach_error_string(kr));
            goto failed;
        }
        
        // Add a send right to the port.
        KSLOG_DEBUG("Adding send rights to port.");
        kr = mach_port_insert_right(thisTask,
                                    g_exceptionPort,
                                    g_exceptionPort,
                                    MACH_MSG_TYPE_MAKE_SEND);
        if(kr != KERN_SUCCESS)
        {
            KSLOG_ERROR("mach_port_insert_right: %s", mach_error_string(kr));
            goto failed;
        }
    }
    // Install the port as the task's exception port.
    KSLOG_DEBUG("Installing port as exception handler.");
    kr = task_set_exception_ports(thisTask,
                                  mask,
                                  g_exceptionPort,
                                  EXCEPTION_DEFAULT,
                                  THREAD_STATE_NONE);
    if(kr != KERN_SUCCESS)
    {
        KSLOG_ERROR("task_set_exception_ports: %s", mach_error_string(kr));
        goto failed;
    }

    // Create the secondary exception thread.
    KSLOG_DEBUG("Creating secondary exception thread (suspended).");
    // Create it as a detached thread.
    pthread_attr_init(&attr);
    attributes_created = true;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    error = pthread_create(&g_secondaryPThread,
                           &attr,
                           &handleExceptions,
                           kThreadSecondary);
    if(error != 0)
    {
        KSLOG_ERROR("pthread_create_suspended_np: %s", strerror(error));
        goto failed;
    }
    // Get the Mach thread that backs the detached pthread.
    g_secondaryMachThread = pthread_mach_thread_np(g_secondaryPThread);
    ksmc_addReservedThread(g_secondaryMachThread);

    // Create the primary exception thread.
    KSLOG_DEBUG("Creating primary exception thread.");
    error = pthread_create(&g_primaryPThread,
                           &attr,
                           &handleExceptions,
                           kThreadPrimary);
    if(error != 0)
    {
        KSLOG_ERROR("pthread_create: %s", strerror(error));
        goto failed;
    }
    pthread_attr_destroy(&attr);
    g_primaryMachThread = pthread_mach_thread_np(g_primaryPThread);
    ksmc_addReservedThread(g_primaryMachThread);

    KSLOG_DEBUG("Mach exception handler installed.");
    return true;

    ...
}

Each detached thread then runs the following function:

static void* handleExceptions(void* const userData)
{
    ...
    for(;;)
    {
        KSLOG_DEBUG("Waiting for mach exception");

        // Wait for an exception message to arrive.
        kern_return_t kr = mach_msg(&exceptionMessage.header,
                                    MACH_RCV_MSG,
                                    0,
                                    sizeof(exceptionMessage),
                                    g_exceptionPort,
                                    MACH_MSG_TIMEOUT_NONE,
                                    MACH_PORT_NULL);
        if(kr == KERN_SUCCESS)
        {
            break;
        }

        // Loop and try again on failure.
        KSLOG_ERROR("mach_msg: %s", mach_error_string(kr));
    }

   ...
}

1.3 Fatal signals

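In the code, KSCrashMonitorTypeSignal indicates whether fatal signal monitoring is enabled, and KSCrashMonitor_Signal contains the implementation. As with the other monitors, the common API static void setEnabled(bool isEnabled) is the entry point, and it eventually calls installSignalHandler.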
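The article does not show the body of installSignalHandler, so the following is only a minimal sketch of how handlers for a list of signals can be installed with the POSIX sigaction call, saving the previous dispositions so they can be restored. The names install_signal_handlers, on_signal, g_previous, and g_last_signal are hypothetical, and the handler merely records the signal number instead of writing a crash report.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SIGNALS 16

/* Previous dispositions, saved so they can be restored later. */
static struct sigaction g_previous[MAX_SIGNALS];
static volatile sig_atomic_t g_last_signal = 0;

/* Hypothetical handler: it only records the signal number, which is
 * async-signal-safe. A real crash handler would capture state here. */
static void on_signal(int signum, siginfo_t *info, void *context)
{
    (void)info;
    (void)context;
    g_last_signal = signum;
}

/* Install on_signal for every signal in the list; roll back on failure. */
static bool install_signal_handlers(const int *signals, size_t count)
{
    assert(signals != NULL && count <= MAX_SIGNALS);

    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    action.sa_sigaction = on_signal;

    for (size_t i = 0; i < count; i++) {
        if (sigaction(signals[i], &action, &g_previous[i]) != 0) {
            /* Restore the handlers that were already replaced. */
            while (i-- > 0) {
                sigaction(signals[i], &g_previous[i], NULL);
            }
            return false;
        }
    }
    return true;
}
```

To try it without crashing the process, pass a list such as SIGUSR1 and SIGUSR2 and call raise(SIGUSR1); g_last_signal then holds SIGUSR1. Keeping the old struct sigaction values is what makes it possible to hand a signal on to whatever handler was installed before.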

1.4 C++ exceptions

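In the code, the enum value KSCrashMonitorTypeCPPException represents C++ exceptions, and KSCrashMonitor_CPPException contains the implementation. Inside the common API static void setEnabled(bool isEnabled), the callback is registered with std::set_terminate(terminate_handler).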

1.5 Objective-C exceptions

In the code, the enum value is KSCrashMonitorTypeNSException, and KSCrashMonitor_NSException contains the implementation. Inside the common API static void setEnabled(bool isEnabled), the callback is registered with void NSSetUncaughtExceptionHandler(NSUncaughtExceptionHandler * _Nullable).

1.6 Main thread deadlock

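Main thread deadlock monitoring detects a deadlocked main thread. It is represented by the enum value KSCrashMonitorTypeMainThreadDeadlock and implemented in KSCrashMonitor_Deadlock. Inside static void setEnabled(bool isEnabled), deadlock monitoring is installed by initializing a KSCrashDeadlockMonitor.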

1.7 Custom crashes

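Custom crashes are represented by KSCrashMonitorTypeUserReported and implemented in KSCrashMonitor_User. Unlike the monitors above, a custom crash has no standard trigger point; the application defines it. When some special situation arises, the application can collect information and pass it to KSCrashMonitor_User for further processing.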

0x02 Runtime Behavior

2.1 Capture

2.1.1 Mach kernel exceptions

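In Mach, exceptions are handled through the kernel's basic infrastructure, message passing. An exception is thrown by the faulting thread or task (via msg_send()) and caught by a handler (via msg_recv()). The handler can handle the exception, clear it (mark it as completed and continue), or decide to terminate the thread.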

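The Mach exception model differs from other exception models. In those, the handler runs in the context of the faulting thread; in Mach, the handler runs in a different context. The faulting thread sends a message to a previously designated exception port and waits for the reply. Each task can register an exception port, which applies to all threads in that task. In addition, an individual thread can register its own exception port with thread_set_exception_ports.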

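So for Mach kernel exceptions, mach_task_self is used to get the current task. Because a Mach exception is essentially a forwarded message, the port needs a receive right, which is granted when the exception port is created: mach_port_allocate(thisTask, MACH_PORT_RIGHT_RECEIVE, &g_exceptionPort). The port is then also given a send right with MACH_MSG_TYPE_MAKE_SEND, because the subsequent task_set_exception_ports call requires it. task_set_exception_ports then sets the port as the target task's exception port.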

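At this point exceptions are delivered to the port, but nothing handles them yet. That requires an active listener on the exception port, created with mach_msg. Exceptions can be handled by another thread of the same program or by a different program altogether; the exception ports that launchd registers for processes work this way. So two detached threads, g_secondaryPThread and g_primaryPThread, are started to wait for an exception.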

for(;;)
    {
        KSLOG_DEBUG("Waiting for mach exception");
        // Message loop: block until a message arrives. Only exception
        // messages are delivered to the exception port.
        kern_return_t kr = mach_msg(&exceptionMessage.header,
                                    MACH_RCV_MSG,
                                    0,
                                    sizeof(exceptionMessage),
                                    g_exceptionPort,
                                    MACH_MSG_TIMEOUT_NONE,
                                    MACH_PORT_NULL);
        if(kr == KERN_SUCCESS)
        {
            break;
        }

        // Loop and try again on failure.
        KSLOG_ERROR("mach_msg: %s", mach_error_string(kr));
    }

The figure below shows how Mach waits for an exception:

2.1.2 Fatal signals

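Mach already provides low-level trap handling through its exception mechanism, and BSD builds signal handling on top of it. Signals generated by hardware are caught by Mach and converted into the corresponding UNIX signals. To keep a single unified mechanism, signals generated by the operating system and by the user are first converted into Mach exceptions and then into signals, as shown in the figure below:

As the figure shows, the difference from our custom Mach exception capture lies in what happens after the Mach exception is caught.

When the BSD process (the user-mode process) is started by bsdinit_task(), a Mach kernel thread named ux_handler is set up. ux_handler works much like the Mach exception capture described above, except that its job is to convert Mach exceptions into signals.

Hardware-generated signals originate as processor traps, which are platform specific. ux_exception is responsible for converting traps into signals. To deal with machine-specific cases, ux_exception first calls machine_exception to try to handle the machine trap; if that function cannot convert it into a signal, ux_exception handles the general case.

If a signal is not generated by hardware, it comes from one of two API calls, kill or pthread_kill, which send a signal to a process and to a thread respectively.

In short, signals can be viewed as a wrapper around hardware and software exceptions: each hardware or software error maps to a signal. KSCrash registers callbacks for the following signals: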

static const int g_fatalSignals[] =
{
    SIGABRT,
    SIGBUS,
    SIGFPE,
    SIGILL,
    SIGPIPE,
    SIGSEGV,
    SIGSYS,
    SIGTRAP,
};

2.1.3 C++ exceptions

C++ exceptions are captured by registering a callback with the standard library function std::set_terminate(CPPExceptionTerminate).

2.1.4 Objective-C exceptions

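Objective-C exceptions are captured through the system callback registered with NSSetUncaughtExceptionHandler. Note that registrations can overwrite one another: only one handler is active at a time, so a later registration replaces an earlier one.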
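A common way to live with the single handler slot is to read the current handler before registering and to forward to it afterwards. The sketch below models that pattern in plain C. The slot functions get_handler and set_handler are hypothetical stand-ins for NSGetUncaughtExceptionHandler and NSSetUncaughtExceptionHandler, and other_sdk_handler stands in for a handler that some other SDK registered first; none of this is KSCrash code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-slot registry standing in for the system's slot. */
typedef void (*UncaughtHandler)(const char *reason);
static UncaughtHandler g_slot = NULL;

static UncaughtHandler get_handler(void) { return g_slot; }
static void set_handler(UncaughtHandler handler) { g_slot = handler; }

static UncaughtHandler g_previous_handler = NULL;
static int g_ours_called = 0;
static int g_theirs_called = 0;

/* Stand-in for a handler registered earlier by another SDK. */
static void other_sdk_handler(const char *reason)
{
    (void)reason;
    g_theirs_called++;
}

/* Our handler does its own work, then forwards to the one it replaced. */
static void our_handler(const char *reason)
{
    assert(reason != NULL);
    g_ours_called++;
    if (g_previous_handler != NULL) {
        g_previous_handler(reason);
    }
}

/* Remember whatever is registered now, then take over the slot. */
static void install_chained(void)
{
    if (get_handler() == our_handler) {
        return; /* already installed; chaining to ourselves would recurse */
    }
    g_previous_handler = get_handler();
    set_handler(our_handler);
}
```

Forwarding only helps with handlers registered before ours. If another component registers later without forwarding, our handler is silently replaced, which is the overwrite problem described above.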

2.1.5 Main thread deadlock

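A background thread runs runMonitor, which calls watchdogPulse. watchdogPulse uses dispatch_async to have the main thread reset a flag; if the flag has not been reset in time, the main thread is considered deadlocked.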

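2.2 Recording

2.2.1 Flow

Collection: data collection starts the moment the crash occurs. It gathers mainly system information and crash information; the crash information consists chiefly of the stack addresses, the reason, and so on. Flow: the report is written field by field, in the order report, binary_images, process, system, crash. Symbolication is described in detail below.

Symbol collection

For an NSException, the address stack is taken directly from callStackReturnAddresses. For a Mach exception, the stack is derived from the registers: first the address in the pc register is read and symbolicated, then the lr register, then the current fp. The walk then follows the chain of frame pointers, symbolicating each return address, until the current address is 0 or the previous frame is null. For a signal, the stack is obtained the same way as for a Mach exception.

Symbolication

Symbolication operates on addresses; the core code is shown below. It first calls imageIndexContainingAddress to find the index of the image that contains the given address. How is that index found?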
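The register-based walk can be sketched as follows. The FrameRecord layout (saved frame pointer followed by return address) and the function walk_stack are assumptions for illustration, and the sketch follows in-process pointers directly. A crash handler cannot trust a corrupted stack like that and would have to validate every memory read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame record layout assumed here: saved fp, then the return address. */
typedef struct FrameRecord {
    const struct FrameRecord *previous;
    uintptr_t return_address;
} FrameRecord;

/* Collect pc, then lr, then the return address of every frame reachable
 * from fp. Stops at a null frame, a zero address, or a full buffer.
 * Returns the number of addresses written to `out`. */
static size_t walk_stack(uintptr_t pc, uintptr_t lr, const FrameRecord *fp,
                         uintptr_t *out, size_t max_entries)
{
    assert(out != NULL);
    size_t n = 0;
    if (n < max_entries) {
        out[n++] = pc;
    }
    if (n < max_entries && lr != 0) {
        out[n++] = lr;
    }
    while (fp != NULL && fp->return_address != 0 && n < max_entries) {
        out[n++] = fp->return_address;
        fp = fp->previous;
    }
    return n;
}
```

Each collected address is then handed to the symbolication step. The max_entries bound also protects against a cyclic frame chain, which would otherwise never terminate.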

bool ksdl_dladdr(const uintptr_t address, Dl_info* const info)
{
    info->dli_fname = NULL;
    info->dli_fbase = NULL;
    info->dli_sname = NULL;
    info->dli_saddr = NULL;

    const uint32_t idx = imageIndexContainingAddress(address);
    if(idx == UINT_MAX)
    {
        return false;
    }
    const struct mach_header* header = _dyld_get_image_header(idx);
    const uintptr_t imageVMAddrSlide = (uintptr_t)_dyld_get_image_vmaddr_slide(idx);
    const uintptr_t addressWithSlide = address - imageVMAddrSlide;
    const uintptr_t segmentBase = segmentBaseOfImageIndex(idx) + imageVMAddrSlide;
    if(segmentBase == 0)
    {
        return false;
    }

    info->dli_fname = _dyld_get_image_name(idx);
    info->dli_fbase = (void*)header;

    // Find symbol tables and get whichever symbol is closest to the address.
    const STRUCT_NLIST* bestMatch = NULL;
    uintptr_t bestDistance = ULONG_MAX;
    uintptr_t cmdPtr = firstCmdAfterHeader(header);
    if(cmdPtr == 0)
    {
        return false;
    }
    for(uint32_t iCmd = 0; iCmd < header->ncmds; iCmd++)
    {
        const struct load_command* loadCmd = (struct load_command*)cmdPtr;
        if(loadCmd->cmd == LC_SYMTAB)
        {
            const struct symtab_command* symtabCmd = (struct symtab_command*)cmdPtr;
            const STRUCT_NLIST* symbolTable = (STRUCT_NLIST*)(segmentBase + symtabCmd->symoff);
            const uintptr_t stringTable = segmentBase + symtabCmd->stroff;

            for(uint32_t iSym = 0; iSym < symtabCmd->nsyms; iSym++)
            {
                // If n_value is 0, the symbol refers to an external object.
                if(symbolTable[iSym].n_value != 0)
                {
                    uintptr_t symbolBase = symbolTable[iSym].n_value;
                    uintptr_t currentDistance = addressWithSlide - symbolBase;
                    if((addressWithSlide >= symbolBase) &&
                       (currentDistance <= bestDistance))
                    {
                        bestMatch = symbolTable + iSym;
                        bestDistance = currentDistance;
                    }
                }
            }
            if(bestMatch != NULL)
            {
                info->dli_saddr = (void*)(bestMatch->n_value + imageVMAddrSlide);
                if(bestMatch->n_desc == 16)
                {
                    // This image has been stripped. The name is meaningless, and
                    // almost certainly resolves to "_mh_execute_header"
                    info->dli_sname = NULL;
                }
                else
                {
                    info->dli_sname = (char*)((intptr_t)stringTable + (intptr_t)bestMatch->n_un.n_strx);
                    if(*info->dli_sname == '_')
                    {
                        info->dli_sname++;
                    }
                }
                break;
            }
        }
        cmdPtr += loadCmd->cmdsize;
    }
    
    return true;
}

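First, the slide is subtracted from the given address to obtain the address's position within the image; call this the adjusted address. The function iterates over all currently loaded images and, within each image, over all of its load commands (LC), checking whether the adjusted address falls within the address range described by a segment load command. Concretely: for each image and each of its load commands, check whether the adjusted address is at or above the VM Address and below VM Address plus VM Size. If so, that image is the one being looked for.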
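The range check can be sketched without any Mach-O headers. The Segment and Image structs and image_index_containing are hypothetical simplifications: a real implementation reads the slide from dyld and the ranges from each image's segment load commands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOT_FOUND ((size_t)-1)

/* Hypothetical, flattened view of what the segment load commands provide. */
typedef struct {
    uintptr_t vmaddr;
    uintptr_t vmsize;
} Segment;

typedef struct {
    uintptr_t slide;          /* load offset applied to this image */
    const Segment *segments;
    size_t segment_count;
} Image;

/* Return the index of the image with a segment containing `address`. */
static size_t image_index_containing(const Image *images, size_t count,
                                     uintptr_t address)
{
    assert(images != NULL);
    for (size_t i = 0; i < count; i++) {
        /* Remove this image's slide before comparing with vmaddr. */
        const uintptr_t adjusted = address - images[i].slide;
        for (size_t s = 0; s < images[i].segment_count; s++) {
            const Segment *seg = &images[i].segments[s];
            if (adjusted >= seg->vmaddr &&
                adjusted < seg->vmaddr + seg->vmsize) {
                return i;
            }
        }
    }
    return NOT_FOUND;
}
```

The slide has to be removed per image because each image is loaded at its own offset, while the VM Address values in its load commands describe the unslid layout.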

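Back in ksdl_dladdr: once the image containing the address is known, the next step is to get the segment base address of the specified image. This again iterates over the image's load commands, this time to find LC_SEGMENT(__LINKEDIT). __LINKEDIT holds the information the dynamic linker needs, including relocation information (for example the symbol table and string table), binding information, and lazy-loading information. When LC_SEGMENT(__LINKEDIT) is found, the function returns its VM Address minus its File Offset, and the caller adds the actual memory slide to that value. What is really being computed is the base address for the current symbol table and string table. The formula is: actual location = VM address of __LINKEDIT in the Mach-O - file offset of __LINKEDIT + slide + the table's own offset (symoff or stroff, which is measured from the start of the file and so already includes the file offset).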

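Next, the load commands are iterated once more to find the symbol table (LC_SYMTAB), and the symbol table is scanned, closing in on the target address step by step to find the best match: the symbol with the highest address that does not exceed the target.

With the best-matching symbol found, the corresponding string is looked up. The structure of a single symbol entry is shown below; n_strx, the String Table Index, locates the name in the string table. That completes symbolication of one address, and the remaining addresses are handled the same way.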

struct nlist_64 {
    union {
        uint32_t  n_strx; /* index into the string table */
    } n_un;
    uint8_t n_type;        /* type flag, see below */
    uint8_t n_sect;        /* section number or NO_SECT */
    uint16_t n_desc;       /* see <mach-o/stab.h> */
    uint64_t n_value;      /* value of this symbol (or stab offset) */
};
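The closest-symbol search and the name lookup can be tried out on a tiny hand-built table. The Symbol struct is a simplified stand-in for nlist_64 that keeps only the two fields used here, and closest_symbol and symbol_name are hypothetical helpers that mirror the loop in ksdl_dladdr without the Mach-O parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for nlist_64: only the fields used here. */
typedef struct {
    uint32_t strx;    /* offset of the name in the string table */
    uint64_t value;   /* symbol address before the slide is applied */
} Symbol;

/* Return the symbol with the highest address not above `address`,
 * or NULL if there is none. Entries with value 0 are external. */
static const Symbol *closest_symbol(const Symbol *symbols, size_t count,
                                    uint64_t address)
{
    assert(symbols != NULL);
    const Symbol *best = NULL;
    uint64_t best_distance = UINT64_MAX;
    for (size_t i = 0; i < count; i++) {
        if (symbols[i].value == 0 || symbols[i].value > address) {
            continue;
        }
        const uint64_t distance = address - symbols[i].value;
        if (distance <= best_distance) {
            best = &symbols[i];
            best_distance = distance;
        }
    }
    return best;
}

/* Look the name up in the string table and skip the leading underscore. */
static const char *symbol_name(const char *string_table, const Symbol *symbol)
{
    assert(string_table != NULL && symbol != NULL);
    const char *name = string_table + symbol->strx;
    return (*name == '_') ? name + 1 : name;
}
```

For a string table "\0_main\0_helper" with symbols at 0x1000 (strx 1) and 0x1040 (strx 7), the address 0x1044 resolves to the second symbol, and its name is returned as "helper". The scan is linear in the number of symbols, which is simple but can be slow for large images.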

The whole capture flow is shown in the figure below.

0x03 References

《深入解析Mac OS X & iOS操作系统》

《iOS逆向应用与安全》

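Original-content statement: the author has authorized publication of this article on the Tencent Cloud Developer Community. Do not reproduce it without permission.

In case of infringement, contact cloudcommunity@tencent.com to have the content removed.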
