std::codecvt

Defined in header <locale>

template< class InternT, class ExternT, class State > class codecvt;

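std::codecvt encapsulates the conversion of character strings, including wide and multibyte, from one encoding to another. All file I/O operations performed through std::basic_fstream<CharT> use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.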

Inheritance diagram: std::codecvt derives from std::locale::facet and std::codecvt_base.

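The standard library provides four standalone (locale-independent) specializations:

Defined in header <locale>

std::codecvt<char, char, std::mbstate_t>

identity conversion

std::codecvt<char16_t, char, std::mbstate_t> (since C++11)

conversion between UTF-16 and UTF-8

std::codecvt<char32_t, char, std::mbstate_t> (since C++11)

conversion between UTF-32 and UTF-8

std::codecvt<wchar_t, char, std::mbstate_t>

conversion between the system's native wide and the single-byte narrow character sets

In addition, every locale object constructed in a C++ program implements its own (locale-specific) versions of these four specializations.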

Member types

Member type

Definition

intern_type

InternT

extern_type

ExternT

state_type

State

Member functions

(constructor)

constructs a new codecvt facet (public member function)

(destructor)

destructs a codecvt facet (protected member function)

out

invokes do_out (public member function)

in

invokes do_in (public member function)

unshift

invokes do_unshift (public member function)

encoding

invokes do_encoding (public member function)

always_noconv

invokes do_always_noconv (public member function)

length

invokes do_length (public member function)

max_length

invokes do_max_length (public member function)

Member objects

Member name

Type

id (static)

std::locale::id

Protected member functions

do_out virtual

converts a string from internT to externT, such as when writing to file (virtual protected member function)

do_in virtual

converts a string from externT to internT, such as when reading from file (virtual protected member function)

do_unshift virtual

generates the termination character sequence of externT characters for incomplete conversion (virtual protected member function)

do_encoding virtual

returns the number of externT characters necessary to produce one internT character, if constant (virtual protected member function)

do_always_noconv virtual

tests if the facet encodes an identity conversion for all valid argument values (virtual protected member function)

do_length virtual

calculates the length of the externT string that would be consumed by conversion into given internT buffer (virtual protected member function)

do_max_length virtual

returns the maximum number of externT characters that could be converted into a single internT character (virtual protected member function)

Inherited from std::codecvt_base

Member type

Definition

enum result { ok, partial, error, noconv };

Unscoped enumeration type

Enumeration constant

Definition

ok

conversion was completed with no error

partial

not all source characters were converted

error

encountered an invalid character

noconv

no conversion required, input and output types are the same

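The following example reads a UTF-8 file using a locale that implements UTF-8 conversion in codecvt<wchar_t, char, mbstate_t>, and converts a UTF-8 string to UTF-16 using one of the standard specializations of std::codecvt.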

#include <iostream>
#include <fstream>
#include <string>
#include <locale>
#include <iomanip>
#include <codecvt>
#include <utility> // std::forward
 
// utility wrapper to adapt locale-bound facets for wstring/wbuffer convert
template<class Facet>
struct deletable_facet : Facet
{
    template<class ...Args>
    deletable_facet(Args&& ...args) : Facet(std::forward<Args>(args)...) {}
    ~deletable_facet() {}
};
 
int main()
{
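    // UTF-8 narrow multibyte encoding, spelled as bytes so that it also compiles
    // in C++20, where u8"..." literals are arrays of char8_t
    std::string data = "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9f\x8d\x8c";
                       // before C++20 also u8"z\u00df\u6c34\U0001f34c"
                       // or u8"zß水🍌"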
 
    std::ofstream("text.txt") << data;
 
    // using system-supplied locale's codecvt facet
    std::wifstream fin("text.txt");
    // reading from wifstream will use codecvt<wchar_t, char, mbstate_t>
    // this locale's codecvt converts UTF-8 to UCS4 (on systems such as Linux)
    fin.imbue(std::locale("en_US.UTF-8"));
    std::cout << "The UTF-8 file contains the following UCS4 code points: \n";
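    for (wchar_t c; fin >> c; )
        // cast to an integer: streaming wchar_t into a char stream is deleted in C++20
        std::cout << "U+" << std::hex << std::setw(4) << std::setfill('0')
                  << static_cast<unsigned long>(c) << '\n';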
 
    // using standard (locale-independent) codecvt facet
    std::wstring_convert<
        deletable_facet<std::codecvt<char16_t, char, std::mbstate_t>>, char16_t> conv16;
    std::u16string str16 = conv16.from_bytes(data);
 
    std::cout << "The UTF-8 file contains the following UTF-16 code points: \n";
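    for (char16_t c : str16)
        // cast to an integer: streaming char16_t into a char stream is deleted in C++20
        std::cout << "U+" << std::hex << std::setw(4) << std::setfill('0')
                  << static_cast<unsigned long>(c) << '\n';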
}

Output:

The UTF-8 file contains the following UCS4 code points:
U+007a
U+00df
U+6c34
U+1f34c
The UTF-8 file contains the following UTF-16 code points:
U+007a
U+00df
U+6c34
U+d83c
U+df4c

See also

Character conversions

| | locale-defined multibyte (UTF-8, GB18030) | UTF-8 | UTF-16 |
|---|---|---|---|
| UTF-16 | mbrtoc16 / c16rtomb (with C11's DR488) | codecvt<char16_t, char, mbstate_t>, codecvt_utf8_utf16<char16_t>, codecvt_utf8_utf16<char32_t>, codecvt_utf8_utf16<wchar_t> | N/A |
| UCS2 | c16rtomb (without C11's DR488) | codecvt_utf8<char16_t>, codecvt_utf8<wchar_t> (Windows) | codecvt_utf16<char16_t>, codecvt_utf16<wchar_t> (Windows) |
| UTF-32 | mbrtoc32 / c32rtomb | codecvt<char32_t, char, mbstate_t>, codecvt_utf8<char32_t>, codecvt_utf8<wchar_t> (non-Windows) | codecvt_utf16<char32_t>, codecvt_utf16<wchar_t> (non-Windows) |
| system wide: UTF-32 (non-Windows), UCS2 (Windows) | mbsrtowcs / wcsrtombs, use_facet<codecvt<wchar_t, char, mbstate_t>>(locale) | No | No |

codecvt_base

defines character conversion errors (class template)

codecvt_byname

creates a codecvt facet for the named locale (class template)

codecvt_utf8 (C++11)(deprecated in C++17)

converts between UTF-8 and UCS2/UCS4 (class template)

codecvt_utf16 (C++11)(deprecated in C++17)

converts between UTF-16 and UCS2/UCS4 (class template)

codecvt_utf8_utf16 (C++11)(deprecated in C++17)

converts between UTF-8 and UTF-16 (class template)

 © cppreference.com

Licensed under the Creative Commons Attribution-ShareAlike Unported License v3.0.
