Q: How do I convert std::string to std::u32string in C++11?

Stack Overflow user

Asked on 2020-02-08 13:48:46

2 answers · 1.8K views · 0 followers · 0 votes

I am working with Unicode in C++11 and am currently unable to convert a std::string to a std::u32string.

My code is as follows:

#include <iostream>
#include <string>
#include <locale>
#include "unicode/unistr.h"
#include "unicode/ustream.h"

int main()
{
    constexpr char locale_name[] = "";
    setlocale( LC_ALL, locale_name );
    std::locale::global(std::locale(locale_name));
    std::ios_base::sync_with_stdio(false);
    std::wcin.imbue(std::locale());
    std::wcout.imbue(std::locale());

    std::string str="hello☺😆";

    std::u32string s(str.begin(),str.end());

    icu::UnicodeString ustr = icu::UnicodeString::fromUTF32(reinterpret_cast<const UChar32 *>(s.c_str()), s.size());
    std::cout << "Unicode string is: " << ustr << std::endl;

    std::cout << "Size of unicode string = " << ustr.countChar32() << std::endl;

    std::cout << "Individual characters of the string are:" << std::endl;
    for(int i=0; i < ustr.countChar32(); i++)
      std::cout << icu::UnicodeString(ustr.char32At(i)) << std::endl;

    return 0;
}

When run, the output is (this is not what I expected):

Unicode string is: hello�������
Size of unicode string = 12
Individual characters of the string are:
h
e
l
l
o
�
�
�
�
�
�
�

Please suggest whether the ICU library has a function for this.

c++

c++11

unicode

non-ascii-characters

icu

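2 answers

Stack Overflow user

Accepted answer

Posted on 2020-02-09 15:04:35

Thanks everyone for your help!

Using these two links, I found some relevant functions:

I tried using the codecvt functions, but got this error: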

fatal error: codecvt: No such file or directory
 #include <codecvt>
                   ^
compilation terminated.

So I skipped that, and after further searching I found the mbrtoc32() function, which works :)

Here is the working code:

#include <iostream>
#include <string>
#include <locale>
#include "unicode/unistr.h"
#include "unicode/ustream.h"
#include <cassert>
#include <cwchar>
#include <uchar.h>

int main()
{
    constexpr char locale_name[] = "";
    setlocale( LC_ALL, locale_name );
    std::locale::global(std::locale(locale_name));
    std::ios_base::sync_with_stdio(false);
    std::wcin.imbue(std::locale());
    std::wcout.imbue(std::locale());

    std::string str;
    std::cin >> str;
    //For example, the input string is "hello☺😆"

    std::mbstate_t state{}; // zero-initialized to initial state
    char32_t c32;
    const char *ptr = str.c_str(), *end = str.c_str() + str.size() + 1;

    icu::UnicodeString ustr;

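    // mbrtoc32 returns 0 once it has decoded the terminating null character
    while(std::size_t rc = mbrtoc32(&c32, ptr, end - ptr, &state))
    {
      assert(rc != (std::size_t)-3); // no surrogates in UTF-32
      if(rc == (std::size_t)-1) break; // invalid multibyte sequence
      if(rc == (std::size_t)-2) break; // incomplete multibyte sequence
      // append only after the checks, so c32 is known to hold a decoded value
      icu::UnicodeString temp((UChar32)c32);
      ustr+=temp;
      ptr+=rc;
    }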

    std::cout << "Unicode string is: " << ustr << std::endl;
    std::cout << "Size of unicode string = " << ustr.countChar32() << std::endl;
    std::cout << "Individual characters of the string are:" << std::endl;
    for(int i=0; i < ustr.countChar32(); i++)
      std::cout << icu::UnicodeString(ustr.char32At(i)) << std::endl;

    return 0;
}

For the input hello☺😆, the output is as expected:

Unicode string is: hello☺😆
Size of unicode string = 7
Individual characters of the string are:
h
e
l
l
o
☺
😆
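
For readers who have neither <codecvt> nor a UTF-8 locale available for mbrtoc32(), decoding the bytes by hand is another option. The following is only a minimal sketch and is not taken from the answer above: the name utf8_to_u32 is made up for illustration, it assumes the std::string really holds UTF-8, and it rejects truncated or malformed sequences but does not check for overlong encodings or surrogate code points.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 bytes into UTF-32 code points.
// Assumption: the input is UTF-8. Overlong forms and surrogates are not rejected.
std::u32string utf8_to_u32(const std::string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte tells us the sequence length and carries the top bits.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        // Each continuation byte (10xxxxxx) contributes six more bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cont = static_cast<unsigned char>(in[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cont & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

Calling utf8_to_u32 on the 12 bytes of "hello☺😆" should give a 7-element std::u32string. For production use, a tested library such as ICU is the safer choice, since full UTF-8 validation has more corner cases than this sketch covers.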

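Votes: 0

Stack Overflow user

Posted on 2020-02-08 16:07:54

The output makes sense. Presumably you thought you were defining a 7-character string? Look at str.size(). You defined a 12-character string!

Even though you were able to type "hello☺😆" into your program, that string literal does not consist of seven bytes. Each of the last two characters is expanded into multiple bytes, because those characters fall outside the extended ASCII range (0 to 255, or -128 to 127). The result is a 12-byte string literal, which initializes a 12-character string, which in turn initializes a 12-character u32string. You have mangled the characters you wanted to represent.

Example: the character '☺' is represented as the three bytes 0xE2 0x98 0xBA. If char is signed on your system (likely), these three bytes have the values -30, -104, and -70. The conversion to char32_t promotes each of these values to 32 bits and then converts signed to unsigned, producing the three values 4294967266, 4294967192, and 4294967226. What you wanted was for these three bytes to be decoded together into the single char32_t value 0x0000263A (the code point U+263A). However, your conversion provides no mechanism for (re)combining the bytes.

Similarly, the character '😆' is represented by the four bytes 0xF0 0x9F 0x98 0x86. These are converted into four 32-bit integers rather than the single value 0x0001F606 (U+1F606).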
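
To make the per-byte widening concrete, here is a minimal sketch. The helper names widen_bytes and decode_three_bytes are made up for illustration and are not part of the question's code. The first reproduces the iterator-range construction from the question; the second shows the bit arithmetic a UTF-8 decoder would perform for a three-byte sequence, with no validation at all.

```cpp
#include <string>

// Widens each byte on its own, just like std::u32string s(str.begin(), str.end()).
// No bytes are combined, so a three-byte character becomes three elements.
std::u32string widen_bytes(const std::string& bytes) {
    return std::u32string(bytes.begin(), bytes.end());
}

// Hypothetical helper: decodes one three-byte UTF-8 sequence
// (1110xxxx 10xxxxxx 10xxxxxx) into a code point. Performs no validation.
char32_t decode_three_bytes(unsigned char b0, unsigned char b1, unsigned char b2) {
    return (static_cast<char32_t>(b0 & 0x0F) << 12) |
           (static_cast<char32_t>(b1 & 0x3F) << 6) |
            static_cast<char32_t>(b2 & 0x3F);
}
```

For the bytes 0xE2 0x98 0xBA, widen_bytes produces three separate elements, whereas decode_three_bytes(0xE2, 0x98, 0xBA) produces the single value 0x263A.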

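To get the desired result, you need to tell the compiler to interpret the string literal as 7 characters. Try the following initialization of s: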

std::u32string s = U"hello☺😆";

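The U prefix on the string literal tells the compiler that each character represents one UTF-32 character. This yields the desired 7-character string (assuming the compiler and the editor agree on the character encoding, which I consider a reasonable assumption).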
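
If the editor/compiler encoding agreement is in doubt, the same literal can be spelled with universal character names. This is a small sketch under that assumption; make_sample is a hypothetical name, and U+263A and U+1F606 are the code points of the two non-ASCII characters discussed above.

```cpp
#include <string>

// \u263A and \U0001F606 name the code points directly, so the result
// does not depend on how the source file itself is encoded.
std::u32string make_sample() {
    return U"hello\u263A\U0001F606";
}
```

Here make_sample().size() is 7, and the last two elements are 0x263A and 0x1F606.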

Free debugging tip: when your output is not what you expect, check your data at each stage to make sure your input is what you expect.

Votes: 4

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/60127405
