The server is set to serve responses as UTF-8, all files are saved as UTF-8, and every setting I know of is set to UTF-8.
Here is a quick program to test whether the output works:
<?php
$html = <<<HTML
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>Test!</title>
</head>
<body>
<h1>☆ Hello ☆ World ☆</h1>
</body>
</html>
HTML;
$dom = new DOMDocument("1.0", "utf-8");
$dom->loadHTML($html);
header("Content-Type: text/html; charset=utf-8");
echo($dom->saveHTML());
The output of this program is:
<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>Test!</title></head><body>
<h1>&acirc;&#152;&#134; Hello &acirc;&#152;&#134; World &acirc;&#152;&#134;</h1>
</body></html>
It renders as:
â˜† Hello â˜† World â˜†
What could I be doing wrong? Do I need to be more specific to tell DOMDocument to handle UTF-8 correctly?
Posted on 2013-06-05 12:55:03
There is a quicker fix: after loading the HTML document into DOMDocument, you just set (or rather, reset) the original encoding. Here is some sample code:
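$dom = new DOMDocument();
// The XML declaration prefix tells the parser that the input is UTF-8
$dom->loadHTML('<?xml encoding="UTF-8">' . $html);
// Remove the processing instruction that the prefix leaves in the tree
foreach ($dom->childNodes as $item) {
    if ($item->nodeType == XML_PI_NODE) {
        $dom->removeChild($item);
    }
}
$dom->encoding = 'UTF-8'; // reset original encoding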
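As a rough sketch, the steps in this answer could be wrapped in a small reusable function. The name `loadUtf8Html` is hypothetical and not part of the original answer. The `@` operator and the `iterator_to_array()` snapshot are additions made here for illustration, and the exact parser behaviour may vary with the installed libxml version.

```php
<?php
// Hypothetical helper (not from the original answer): loads UTF-8 HTML into
// a DOMDocument using the XML-declaration prefix described above.
function loadUtf8Html(string $html): DOMDocument
{
    $dom = new DOMDocument();
    // The prefix hints to the parser that the input bytes are UTF-8.
    // @ suppresses warnings the parser may raise for HTML5 markup.
    @$dom->loadHTML('<?xml encoding="UTF-8">' . $html);

    // Iterate over a snapshot so that removing a node does not disturb
    // the live child list while it is being traversed.
    foreach (iterator_to_array($dom->childNodes) as $node) {
        if ($node->nodeType === XML_PI_NODE) {
            $dom->removeChild($node);
        }
    }

    $dom->encoding = 'UTF-8'; // reset original encoding
    return $dom;
}

$html = <<<HTML
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>Test!</title>
</head>
<body>
<h1>☆ Hello ☆ World ☆</h1>
</body>
</html>
HTML;

$dom = loadUtf8Html($html);
$heading = $dom->getElementsByTagName('h1')->item(0)->textContent;
echo $heading, PHP_EOL;
```

Reading the heading back through `textContent` is a quick way to check that the stars survived parsing, independent of how `saveHTML()` chooses to serialize them.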
Posted on 2012-07-03 18:52:55
<?php
header("Content-type: text/html; charset=utf-8");
$html = <<<HTML
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>Test!</title>
</head>
<body>
<h1>☆ Hello ☆ World ☆</h1>
</body>
</html>
HTML;
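// Convert non-ASCII characters to HTML entities so that loadHTML()
// no longer has to guess the input encoding.
// Note: the 'HTML-ENTITIES' target is deprecated as of PHP 8.2.
$html = mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8");
$dom = new DOMDocument("1.0", "utf-8");
$dom->loadHTML($html);
// The Content-Type header was already sent at the top of the script.
echo($dom->saveHTML());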
Output:
<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>Test!</title></head><body>
<h1>☆ Hello ☆ World ☆</h1>
</body></html>
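Because `mb_convert_encoding()` with the `'HTML-ENTITIES'` target is deprecated as of PHP 8.2, the same idea can be sketched with `mb_encode_numericentity()` instead. This is a swapped-in alternative, not what the answer above uses. The helper name `toNumericEntities` is hypothetical, and the conversion map is an assumption chosen to cover every code point above ASCII.

```php
<?php
// Hypothetical helper (not from the original answer): replaces every
// non-ASCII character with a decimal numeric entity such as &#9734;.
function toNumericEntities(string $html): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($html, $convmap, 'UTF-8');
}

$encoded = toNumericEntities('<h1>☆ Hello ☆ World ☆</h1>');
echo $encoded, PHP_EOL; // <h1>&#9734; Hello &#9734; World &#9734;</h1>

// The entity-encoded markup is plain ASCII, so the parser's encoding
// guess no longer matters when it is loaded.
$dom = new DOMDocument();
@$dom->loadHTML($encoded);
$text = $dom->getElementsByTagName('h1')->item(0)->textContent;
echo $text, PHP_EOL; // ☆ Hello ☆ World ☆
```

ASCII characters, including the markup itself, fall outside the map and pass through unchanged, so the tags are left intact.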
https://stackoverflow.com/questions/11309194