Q: Substring after X characters with special characters

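Stack Overflow user

Asked 2012-07-24 18:50:35

Answers 5 · Views 4K · Followers 0 · Votes 4

Sorry, I really don't know how to phrase this.

I often have a string that needs to be cut after X characters. My problem is that the string often contains special characters such as &egrave;

So I'd like to know whether there is a way in PHP to tell, without converting my string, whether I am in the middle of a special character when I cut the string.

Example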

This is my string with a special char : &egrave; - and I want it to cut in the middle of the "&egrave;" but still keeping the string intact

So right now the result of my substring is:

This is my string with a special char : &egra

But I want something like this:

This is my string with a special char : &egrave;

special-characters

php

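5 Answers

Stack Overflow user

Accepted answer

Posted 2012-07-24 19:10:37

The best approach here is to store your strings as UTF-8 without any HTML entities, and to use the mb_* family of functions with utf8 as the encoding.

However, if your string is ASCII or iso-8859-1/win1252, you can use the special HTML-ENTITIES encoding of the mb_string library: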

$s = 'This is my string with a special char : &egrave; - and I want it to cut in the middle of the "&egrave;" but still keeping the string intact';
echo mb_substr($s, 0, 40, 'HTML-ENTITIES');
echo mb_substr($s, 0, 41, 'HTML-ENTITIES');

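However, if your underlying string is UTF-8 or some other multibyte encoding, using HTML-ENTITIES is not safe! This is because HTML-ENTITIES really means "win1252 with high-bit characters as HTML entities". Here is an example of how it can go wrong: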

// Assuming that é is in utf8:
mb_substr('é ', 0, 2, 'HTML-ENTITIES') === '&Atilde;&copy;'
// should be '&eacute; '

When your string is in a multibyte encoding, you must convert all HTML entities to a common encoding before splitting. For example:

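// The string's real encoding, spelled the way html_entity_decode() documents it.
$strings_actual_encoding = 'UTF-8';
// Decode the entities first so that every character is a single unit in that encoding.
$s_noentities = html_entity_decode($s, ENT_QUOTES, $strings_actual_encoding);
// Then cut on whole characters.
$s_trunc_noentities = mb_substr($s_noentities, 0, 41, $strings_actual_encoding);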
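
To make the decode-then-cut step concrete, here is a minimal sketch wrapped in a function. The helper name `truncate_decoded` is hypothetical (it does not come from the answer), and the sketch assumes the input is UTF-8 text that may contain HTML entities.

```php
<?php
// Hypothetical helper: decode entities first, then cut on real characters.
// Assumes $s is UTF-8 text that may contain HTML entities.
function truncate_decoded(string $s, int $length, string $encoding = 'UTF-8'): string
{
    $decoded = html_entity_decode($s, ENT_QUOTES, $encoding);
    return mb_substr($decoded, 0, $length, $encoding);
}

$s = 'This is my string with a special char : &egrave; - and more text';
// The first 40 characters end just before the entity, so 41 includes the decoded "è".
echo truncate_decoded($s, 41), "\n";
```

The return value is plain UTF-8 rather than entity-encoded text, so it would still need escaping (for example with htmlspecialchars()) when it is written into an HTML page.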

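Votes 7

Stack Overflow user

Posted 2012-07-24 19:04:12

The best solution is to store your text as UTF-8 rather than as HTML entities. Failing that, if you don't mind the count being off (`&egrave;` counts as one character rather than 8), the following snippet should work: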

<?php
$string = 'This is my string with a special char : &egrave; - and I want it to cut in the middle of the "&egrave;" but still keeping the string intact';
// Decode entities, cut on whole characters, then re-encode the result.
$cut_string = htmlentities(mb_substr(html_entity_decode($string, ENT_QUOTES, 'UTF-8'), 0, 45, 'UTF-8'), ENT_QUOTES, 'UTF-8')."<br><br>";

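Note: if you encoded your text with a different function (e.g. htmlspecialchars()), use that function instead of htmlentities(). If you used a custom function, use another custom function that does the reverse of it in place of html_entity_decode() (and your custom function in place of htmlentities()).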
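
As an illustration of that note, the sketch below passes the decode/encode pair in as callables, so the same cut works with whichever functions produced the text. The helper name `cut_encoded` is hypothetical, UTF-8 is assumed throughout, and the arrow functions need PHP 7.4 or later. htmlspecialchars_decode() is the built-in inverse of htmlspecialchars().

```php
<?php
// Hypothetical helper: cut an encoded string using a matching decode/encode pair.
// Assumes the decoded text is UTF-8.
function cut_encoded(string $s, int $length, callable $decode, callable $encode): string
{
    return $encode(mb_substr($decode($s), 0, $length, 'UTF-8'));
}

// Text that was encoded with htmlspecialchars() pairs with htmlspecialchars_decode().
$decode = fn(string $t): string => htmlspecialchars_decode($t, ENT_QUOTES);
$encode = fn(string $t): string => htmlspecialchars($t, ENT_QUOTES, 'UTF-8');

$escaped = 'Tom &amp; Jerry &lt;3';
echo cut_encoded($escaped, 5, $decode, $encode), "\n"; // prints "Tom &amp;"
```

As in the answer's snippet, the length counts decoded characters, so the returned string can be longer than `$length` bytes once it is re-encoded.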

Votes 4

Stack Overflow user

Posted 2012-07-24 18:55:19

You can first use html_entity_decode() to decode all the HTML entities. Then split your string. Then use htmlentities() to re-encode the entities.

$decoded_string = html_entity_decode($original_string);
// implement logic to split string here

// then for each string part do the following:
$encoded_string_part = htmlentities($split_string_part);
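
One possible way to fill in the split step left open above is sketched here. The function name `split_entity_string` and the fixed chunk length are assumptions made for illustration; the sketch also assumes UTF-8 input and PHP 7.4 or later, which is where mb_str_split() was introduced.

```php
<?php
// Hypothetical helper filling in the "split" step: decode, split on whole
// characters, then re-encode each part. Assumes UTF-8 and PHP 7.4+.
function split_entity_string(string $original, int $chunkLength): array
{
    $decoded = html_entity_decode($original, ENT_QUOTES, 'UTF-8');
    $parts = mb_str_split($decoded, $chunkLength, 'UTF-8');
    return array_map(
        fn(string $part): string => htmlentities($part, ENT_QUOTES, 'UTF-8'),
        $parts
    );
}

// Each part holds at most 4 decoded characters, and no entity is cut in half.
print_r(split_entity_string('caf&eacute; cr&egrave;me', 4));
```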

Votes 3

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/11637390
