Try this:

<pre><code>preg_match_all('/./u', $text, $array);
</code></pre>

You can use the 'u' modifier with PCRE regular expressions; see <a href="http://php.net/manual/en/reference.pcre.pattern.modifiers.php" rel="noreferrer">Pattern Modifiers</a> (quoting):

<blockquote>
 u (PCRE8)
 
 This modifier turns on additional
 functionality of PCRE that is
 incompatible with Perl. Pattern
 strings are treated as UTF-8. This
 modifier is available from PHP 4.1.0
 or greater on Unix and from PHP 4.2.3
 on win32. UTF-8 validity of the
 pattern is checked since PHP 4.3.5.
</blockquote>

For instance, consider this code:

<pre><code>header('Content-type: text/html; charset=UTF-8'); // So the browser doesn't make our lives harder
$str = "abc 文字化け, efg";

$results = array();
preg_match_all('/./', $str, $results);
var_dump($results[0]);
</code></pre>

You'll get an unusable result:

<pre><code>array
 0 =&gt; string 'a' (length=1)
 1 =&gt; string 'b' (length=1)
 2 =&gt; string 'c' (length=1)
 3 =&gt; string ' ' (length=1)
 4 =&gt; string '�' (length=1)
 5 =&gt; string '�' (length=1)
 6 =&gt; string '�' (length=1)
 7 =&gt; string '�' (length=1)
 8 =&gt; string '�' (length=1)
 9 =&gt; string '�' (length=1)
 10 =&gt; string '�' (length=1)
 11 =&gt; string '�' (length=1)
 12 =&gt; string '�' (length=1)
 13 =&gt; string '�' (length=1)
 14 =&gt; string '�' (length=1)
 15 =&gt; string '�' (length=1)
 16 =&gt; string ',' (length=1)
 17 =&gt; string ' ' (length=1)
 18 =&gt; string 'e' (length=1)
 19 =&gt; string 'f' (length=1)
 20 =&gt; string 'g' (length=1)
</code></pre>

But with this code:

<pre><code>header('Content-type: text/html; charset=UTF-8'); // So the browser doesn't make our lives harder
$str = "abc 文字化け, efg";

$results = array();
preg_match_all('/./u', $str, $results);
var_dump($results[0]);
</code></pre>

(Notice the 'u' at the end of the regex.)

You get what you want:

<pre><code>array
 0 =&gt; string 'a' (length=1)
 1 =&gt; string 'b' (length=1)
 2 =&gt; string 'c' (length=1)
 3 =&gt; string ' ' (length=1)
 4 =&gt; string '文' (length=3)
 5 =&gt; string '字' (length=3)
 6 =&gt; string '化' (length=3)
 7 =&gt; string 'け' (length=3)
 8 =&gt; string ',' (length=1)
 9 =&gt; string ' ' (length=1)
 10 =&gt; string 'e' (length=1)
 11 =&gt; string 'f' (length=1)
 12 =&gt; string 'g' (length=1)
</code></pre>
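One caveat that the example above does not show: with the 'u' modifier, PCRE rejects subjects that are not valid UTF-8, so <code>preg_match_all</code> returns <code>false</code> instead of a partial result. Also, without the 's' modifier the dot does not match newlines, so they would be dropped from the result. The helper below is a minimal sketch that makes both points explicit; the name <code>utf8_chars</code> is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: split a UTF-8 string into characters, or return
// null when the input is not valid UTF-8.
function utf8_chars(string $text): ?array
{
    $matches = [];
    // /u treats pattern and subject as UTF-8; /s lets the dot match newlines.
    // On an invalid UTF-8 subject preg_match_all() returns false.
    if (preg_match_all('/./su', $text, $matches) === false) {
        return null; // preg_last_error() reports PREG_BAD_UTF8_ERROR here
    }
    return $matches[0];
}

var_dump(utf8_chars("abc 文字化け")); // array of 8 characters
var_dump(utf8_chars("\xE6\x96"));    // NULL: truncated multi-byte sequence
```

Returning <code>null</code> rather than an empty array keeps "invalid input" distinguishable from "empty string".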

Hope this helps :-)

If for some reason the regex approach isn't enough for you: I once wrote <code>Zend_Locale_UTF8</code>, which is now abandoned but might help if you decide to do it on your own.

In particular, have a look at the class <a href="http://framework.zend.com/code/browse/~raw,r=1301/Zend_Framework/trunk/incubator/library/Zend/Locale/UTF8/PHP5/String.php" rel="nofollow noreferrer"><code>Zend_Locale_UTF8_PHP5_String</code></a>, which reads in Unicode strings and, to work with them, splits them up into single characters (which may obviously consist of multiple bytes).

EDIT:
I just realized that ZF's svn-browser is down, so I copied the important methods for convenience:

<pre><code>/**
 * Returns the UTF-8 code sequence as an array for any given $string.
 *
 * @access protected
 * @param string|integer $string
 * @return array
 */
protected function _decode( $string ) {

    $string   = (string) $string;
    $length   = strlen($string);
    $sequence = array();

    for ( $i=0; $i&lt;$length; ) {
        $bytes = $this-&gt;_characterBytes($string, $i);
        $ord   = $this-&gt;_ord($string, $bytes, $i);

        if ( $ord !== false )
            $sequence[] = $ord;

        // Skip a single byte when the lead byte is invalid
        if ( $bytes === false )
            $i++;
        else
            $i += $bytes;
    }

    return $sequence;

}

/**
 * Returns the UTF-8 code of a character.
 *
 * @see http://en.wikipedia.org/wiki/UTF-8#Description
 * @access protected
 * @param string $string
 * @param integer $bytes
 * @param integer $pos
 * @return integer|false
 */
protected function _ord( &amp;$string, $bytes = null, $pos=0 )
{
    if ( is_null($bytes) )
        $bytes = $this-&gt;_characterBytes($string, $pos);

    // The whole sequence must fit inside the string
    if ( strlen($string) &gt;= $pos + $bytes ) {

        switch ( $bytes ) {
            case 1:
                return ord($string[$pos]);

            case 2:
                return ( (ord($string[$pos])   &amp; 0x1f) &lt;&lt; 6 ) +
                       ( (ord($string[$pos+1]) &amp; 0x3f) );

            case 3:
                return ( (ord($string[$pos])   &amp; 0xf)  &lt;&lt; 12 ) +
                       ( (ord($string[$pos+1]) &amp; 0x3f) &lt;&lt; 6 ) +
                       ( (ord($string[$pos+2]) &amp; 0x3f) );

            case 4:
                // Four distinct bytes: $pos through $pos+3
                return ( (ord($string[$pos])   &amp; 0x7)  &lt;&lt; 18 ) +
                       ( (ord($string[$pos+1]) &amp; 0x3f) &lt;&lt; 12 ) +
                       ( (ord($string[$pos+2]) &amp; 0x3f) &lt;&lt; 6 ) +
                       ( (ord($string[$pos+3]) &amp; 0x3f) );

            case 0:
            default:
                return false;
        }
    }

    return false;
}

/**
 * Returns the number of bytes of the character starting at byte $position.
 *
 * @see http://en.wikipedia.org/wiki/UTF-8#Description
 * @access protected
 * @param string $string
 * @param integer $position
 * @return integer|false
 */
protected function _characterBytes( &amp;$string, $position = 0 ) {
    $char    = $string[$position];
    $charVal = ord($char);

    if ( ($charVal &amp; 0x80) === 0 )
        return 1;

    elseif ( ($charVal &amp; 0xe0) === 0xc0 )
        return 2;

    elseif ( ($charVal &amp; 0xf0) === 0xe0 )
        return 3;

    elseif ( ($charVal &amp; 0xf8) === 0xf0)
        return 4;
    /*
    elseif ( ($charVal &amp; 0xfe) === 0xf8 )
        return 5;
    */

    return false;
}
</code></pre>
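Since the methods above depend on the surrounding class, here is a standalone sketch of the same idea: classify each lead byte, then fold in the low six bits of every continuation byte. The function name <code>utf8_code_points</code> is hypothetical, and the sketch makes simplifying assumptions: it skips invalid lead bytes, stops at a truncated final sequence, and does not validate continuation bytes or reject overlong encodings.

```php
<?php
// Standalone sketch: decode a UTF-8 string into an array of code points.
function utf8_code_points(string $s): array
{
    $out = [];
    $len = strlen($s);
    for ($i = 0; $i < $len; ) {
        $b = ord($s[$i]);
        // The lead byte tells us the sequence length and the payload bits.
        if (($b & 0x80) === 0x00)     { $n = 1; $cp = $b; }
        elseif (($b & 0xE0) === 0xC0) { $n = 2; $cp = $b & 0x1F; }
        elseif (($b & 0xF0) === 0xE0) { $n = 3; $cp = $b & 0x0F; }
        elseif (($b & 0xF8) === 0xF0) { $n = 4; $cp = $b & 0x07; }
        else { $i++; continue; }       // not a valid lead byte: skip it
        if ($i + $n > $len) { break; } // truncated final sequence
        for ($k = 1; $k < $n; $k++) {
            // Each continuation byte contributes its low 6 bits.
            $cp = ($cp << 6) | (ord($s[$i + $k]) & 0x3F);
        }
        $out[] = $cp;
        $i += $n;
    }
    return $out;
}

var_dump(utf8_code_points("aé文")); // 0x61, 0xE9, 0x6587 (printed in decimal)
```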

I was able to write a solution using <code>mb_*</code>, including a trip to UTF-16 and back in a probably silly attempt to speed up string indexing:

<pre><code>$japanese2 = mb_convert_encoding($japanese, "UTF-16", "UTF-8");
$length = mb_strlen($japanese2, "UTF-16");
for ($i = 0; $i &lt; $length; $i++) {
    $char = mb_substr($japanese2, $i, 1, "UTF-16");
    $utf8 = mb_convert_encoding($char, "UTF-8", "UTF-16");
    print $utf8 . "\n";
}
</code></pre>

I had better luck avoiding <code>mb_internal_encoding</code> and just specifying everything at each <code>mb_*</code> call. I'm sure I'll wind up using the <code>preg</code> solution.
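For input that is not UTF-8, a variant without the UTF-16 round trip is to convert once to UTF-8 from the known source encoding and then index with <code>mb_substr</code>, passing the encoding explicitly on every call as described above. This is only a sketch: it assumes the mbstring extension is loaded and that the source encoding is known, and both <code>split_known_encoding</code> and <code>$sourceEncoding</code> are hypothetical names.

```php
<?php
// Hypothetical helper: convert from a known encoding to UTF-8, then collect
// one character at a time with explicit encodings on every mb_* call.
function split_known_encoding(string $input, string $sourceEncoding): array
{
    $utf8   = mb_convert_encoding($input, 'UTF-8', $sourceEncoding);
    $length = mb_strlen($utf8, 'UTF-8');
    $chars  = [];
    for ($i = 0; $i < $length; $i++) {
        $chars[] = mb_substr($utf8, $i, 1, 'UTF-8');
    }
    return $chars;
}

// "é" encoded as ISO-8859-1 is the single byte 0xE9.
var_dump(split_known_encoding("caf\xE9", 'ISO-8859-1')); // "c", "a", "f", "é" (as UTF-8)
```

The returned characters are UTF-8 regardless of the input encoding, which keeps later comparisons simple.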

Slightly simpler than <code>preg_match_all</code>:

<pre><code>preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY)
</code></pre>

This gives you back a 1-dimensional array of characters. No need for a matches object.
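A short usage sketch of the call above, showing why the <code>PREG_SPLIT_NO_EMPTY</code> flag matters (the sample string is just an illustration):

```php
<?php
// The empty pattern matches between every pair of characters.
$chars = preg_split('//u', "añ文", -1, PREG_SPLIT_NO_EMPTY);
var_dump($chars);        // three elements: "a", "ñ", "文"

// Without the flag, the matches at the very start and end of the string
// produce an extra empty string on each side.
$raw = preg_split('//u', "añ文");
var_dump(count($raw));   // int(5)
```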

To split a string into chunks of a given length, I adapted Laravel's <code>str_limit()</code> function:

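<pre><code>public static function split_text($text, $limit = 100, $end = '')
{
    // Note: mb_strimwidth() takes its start offset in characters but its
    // limit in display width, so the chunks are exact only for text made
    // of single-width characters.
    $width = mb_strwidth($text, 'UTF-8');
    if ($width &lt;= $limit) {
        return [$text]; // always return an array, even for short input
    }
    $res = [];
    // Use &lt; rather than &lt;= so that no empty trailing chunk is produced
    for ($i = 0; $i &lt; $width; $i += $limit) {
        $res[] = rtrim(mb_strimwidth($text, $i, $limit, '', 'UTF-8')) . $end;
    }
    return $res;
}
</code></pre>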

<pre><code>function str_split_unicode($str, $l = 0) {
    if ($l &gt; 0) {
        $ret = array();
        $len = mb_strlen($str, "UTF-8");
        for ($i = 0; $i &lt; $len; $i += $l) {
            $ret[] = mb_substr($str, $i, $l, "UTF-8");
        }
        return $ret;
    }
    return preg_split("//u", $str, -1, PREG_SPLIT_NO_EMPTY);
}
var_dump(str_split_unicode("لأآأئؤة"));
</code></pre>

Output:

<pre><code>array (size=7)
 0 =&gt; string 'ل' (length=2)
 1 =&gt; string 'أ' (length=2)
 2 =&gt; string 'آ' (length=2)
 3 =&gt; string 'أ' (length=2)
 4 =&gt; string 'ئ' (length=2)
 5 =&gt; string 'ؤ' (length=2)
 6 =&gt; string 'ة' (length=2)
</code></pre>

For more information: <a href="http://php.net/manual/en/function.str-split.php" rel="nofollow noreferrer">http://php.net/manual/en/function.str-split.php</a>

It's worth mentioning that since PHP 7.4 there's a built-in function, <a href="https://www.php.net/manual/en/function.mb-str-split.php" rel="nofollow noreferrer">mb_str_split</a>, that does this.
<pre><code>$chars = mb_str_split($str);
</code></pre>
Unlike <code>preg_split('//u', $str)</code> this supports encodings other than UTF-8.
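As a sketch of how the encoding argument can be used, the wrapper below prefers <code>mb_str_split</code> when it exists and otherwise falls back to converting to UTF-8 and using <code>preg_split</code>. The name <code>split_chars</code> is hypothetical, and the fallback assumes the mbstring extension is available and that the encoding name is one mbstring recognizes.

```php
<?php
// Hypothetical wrapper: split $str into characters of the given encoding.
function split_chars(string $str, string $encoding = 'UTF-8'): array
{
    if (function_exists('mb_str_split')) {
        // PHP >= 7.4: second argument is the chunk length in characters.
        return mb_str_split($str, 1, $encoding);
    }
    // Older PHP: go through UTF-8 so that the /u modifier can be used.
    $utf8  = mb_convert_encoding($str, 'UTF-8', $encoding);
    $chars = preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    // Convert each character back so both branches return the same encoding.
    return array_map(function ($c) use ($encoding) {
        return mb_convert_encoding($c, $encoding, 'UTF-8');
    }, $chars);
}

var_dump(split_chars("日本語"));                   // three UTF-8 characters
var_dump(split_chars("caf\xE9", 'ISO-8859-1'));   // four single-byte characters
```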

In PHP, what is the best way to split a string into an array of Unicode characters, given that the input is not necessarily UTF-8?

I want to know whether the set of Unicode characters in an input string is a subset of another set of Unicode characters.

Why not run straight for the <code>mb_</code> family of functions, as the first couple of answers didn't?
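The subset check described in the question could be sketched as follows once both strings are split into characters. This assumes both inputs are already UTF-8 (convert them first otherwise) and compares code points as they are, without Unicode normalization; the name <code>chars_subset</code> is hypothetical.

```php
<?php
// Hypothetical helper: true when every character of $input also occurs
// somewhere in $allowed.
function chars_subset(string $input, string $allowed): bool
{
    $inputChars   = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
    $allowedChars = preg_split('//u', $allowed, -1, PREG_SPLIT_NO_EMPTY);
    if ($inputChars === false || $allowedChars === false) {
        throw new InvalidArgumentException('Both strings must be valid UTF-8.');
    }
    // array_diff() keeps the input characters that are missing from $allowed.
    return array_diff($inputChars, $allowedChars) === [];
}

var_dump(chars_subset("かな", "あいうえおかきくけこな"));  // bool(true)
var_dump(chars_subset("かなx", "あいうえおかきくけこな")); // bool(false)
```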

What is the best way to split a string into an array of Unicode characters in PHP?
