mblen

Defined in header <stdlib.h>
int mblen( const char* s, size_t n );

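Determines the size, in bytes, of the multibyte character whose first byte is pointed to by s.

If s is a null pointer, resets the global conversion state and determines whether shift sequences are used.

This function is equivalent to the call mbtowc((wchar_t*)0, s, n), except that the conversion state of mbtowc is unaffected.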

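Notes

Each call to mblen updates the internal global conversion state (a static object of type mbstate_t, known only to this function). If the multibyte encoding uses shift states, care must be taken to avoid backtracking or multiple scans. In any case, multiple threads should not call mblen without synchronization; mbrlen may be used instead.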

Parameters

s	-	pointer to the multibyte character
n	-	limit on the number of bytes in s that can be examined

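Return value

If s is not a null pointer, returns the number of bytes contained in the multibyte character, -1 if the first bytes pointed to by s do not form a valid multibyte character, or 0 if s points at the null character '\0'.

If s is a null pointer, resets the internal conversion state to represent the initial shift state and returns 0 if the current multibyte encoding is not state-dependent (does not use shift sequences), or a non-zero value if the current multibyte encoding is state-dependent (uses shift sequences).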

Example

#include <string.h>
#include <stdlib.h>
#include <locale.h>
#include <stdio.h>
 
// the number of characters in a multibyte string is the number of mblen()
// calls needed to walk it (their return values add up to the byte length)
// note: the simpler approach is mbstowcs(NULL, str, sz)
size_t strlen_mb(const char* ptr)
{
    size_t result = 0;
    const char* end = ptr + strlen(ptr);
    mblen(NULL, 0); // reset the conversion state
    while(ptr < end) {
        int next = mblen(ptr, end - ptr);
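        if(next == -1) {
           // ISO C does not require mblen to set errno, so report directly
           fputs("strlen_mb: invalid multibyte sequence\n", stderr);
           break;
        }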
        ptr += next;
        ++result;
    }
    return result;
}
 
int main(void)
{
    setlocale(LC_ALL, "en_US.utf8");
    const char* str = "z\u00df\u6c34\U0001f34c";
    printf("The string %s consists of %zu bytes, but only %zu characters\n",
            str, strlen(str), strlen_mb(str));
}

Possible output:

The string zß水🍌 consists of 10 bytes, but only 4 characters

References

C11 standard (ISO/IEC 9899:2011):
- 7.22.7.1 The mblen function (p: 357)
C99 standard (ISO/IEC 9899:1999):
- 7.20.7.1 The mblen function (p: 321)
C89/C90 standard (ISO/IEC 9899:1990):
- 4.10.7.1 The mblen function

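See also

mbtowc	converts the next multibyte character to a wide character (function)
mbrlen (C95)	returns the number of bytes in the next multibyte character, given state (function)

C++ documentation for mblen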


Last updated: 2017-12-18