The original text was encoded as:
<?php
&#24180;&#32456;&#24037;&#20316;&#24635;&#32467;
?>
The readable text is: 年终工作总结
How can it be converted to readable text?
sanders_yao gave the following code:
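<?php
// Numeric character references for the text 年终工作总结.
$body = "&#24180;&#32456;&#24037;&#20316;&#24635;&#32467;";
// Decimal references such as &#24180;. The /e modifier was removed in PHP 7, so a callback is used.
$body = preg_replace_callback('/&#(\d+);/', function ($m) {
    return utf8_entity_decode('&#' . $m[1] . ';');
}, $body);
// Hexadecimal references such as &#x5E74; are rewritten to decimal first.
$body = preg_replace_callback('/&#x([a-fA-F0-9]+);/', function ($m) {
    return utf8_entity_decode('&#' . hexdec($m[1]) . ';');
}, $body);
echo $body;
function utf8_entity_decode($entity) {
    // Conversion map: code points 0x0 to 0x10000, offset 0, mask 0xfffff.
    $convmap = array(0x0, 0x10000, 0, 0xfffff);
    return mb_decode_numericentity($entity, $convmap, 'UTF-8');
}
?>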
That does indeed work. wamper added that mb_convert_encoding can do the conversion too, using HTML-ENTITIES as the source encoding.
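wamper's suggestion could look roughly like the following. This is a minimal sketch, assuming the mbstring extension is loaded; the helper name decode_with_mbstring is made up for this illustration. Note that the HTML-ENTITIES pseudo-encoding is deprecated as of PHP 8.2, so newer versions emit a deprecation notice for this call.

```php
<?php
// Hypothetical helper (not from the original thread): decodes HTML
// entities through mbstring's 'HTML-ENTITIES' pseudo-encoding.
// That pseudo-encoding is deprecated as of PHP 8.2.
function decode_with_mbstring($text)
{
    // Arguments: input string, target encoding, source encoding.
    return mb_convert_encoding($text, 'UTF-8', 'HTML-ENTITIES');
}

// Decimal numeric references for the sample text from the question.
$body = '&#24180;&#32456;&#24037;&#20316;&#24635;&#32467;';
$decoded = decode_with_mbstring($body);
echo $decoded, "\n";
```

Unlike the regex version, this needs no pattern of its own, since mbstring parses the references itself.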
So I checked the manual and found that html_entity_decode can do the conversion as well:
<?php
$body = '&#24180;&#32456;&#24037;&#20316;&#24635;&#32467;';
echo html_entity_decode($body,ENT_COMPAT,'UTF-8');
?>
html_entity_decode
(PHP 4 >= 4.3.0, PHP 5)
html_entity_decode -- Convert all HTML entities to their applicable characters
Description
string html_entity_decode ( string string [, int quote_style [, string charset]] )
html_entity_decode() is the opposite of htmlentities() in that it converts all HTML entities to their applicable characters from string.
The optional second quote_style parameter lets you define what will be done with 'single' and "double" quotes. It takes on one of three constants with the default being ENT_COMPAT:
Table 1. Available quote_style constants
| Constant Name | Description |
|---|---|
| ENT_COMPAT | Will convert double-quotes and leave single-quotes alone. |
| ENT_QUOTES | Will convert both double and single quotes. |
| ENT_NOQUOTES | Will leave both double and single quotes unconverted. |
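The three quote_style constants can be compared side by side. This is only an illustrative sketch; the input string and variable names are invented for the example.

```php
<?php
// Both quote characters, written as entities.
$input = '&quot;double&quot; and &#039;single&#039;';

// ENT_COMPAT: decode double quotes only.
$compat = html_entity_decode($input, ENT_COMPAT, 'UTF-8');
// ENT_QUOTES: decode both double and single quotes.
$quotes = html_entity_decode($input, ENT_QUOTES, 'UTF-8');
// ENT_NOQUOTES: leave both kinds of quote as entities.
$noquotes = html_entity_decode($input, ENT_NOQUOTES, 'UTF-8');

echo $compat, "\n", $quotes, "\n", $noquotes, "\n";
```

Passing the constant explicitly avoids depending on the default, which differs between PHP versions.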
The ISO-8859-1 character set is used as default for the optional third charset. This defines the character set used in conversion.
PHP 4.3.0 and later support the following character sets.
Table 2. Supported character sets
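| Charset | Aliases | Description |
|---|---|---|
| ISO-8859-1 | ISO8859-1 | Western European, Latin-1. |
| ISO-8859-15 | ISO8859-15 | Western European, Latin-9. Adds the Euro sign and the French and Finnish letters missing from Latin-1 (ISO-8859-1). |
| UTF-8 | | ASCII-compatible multi-byte 8-bit Unicode. |
| cp866 | ibm866, 866 | DOS-specific Cyrillic charset. Supported since PHP 4.3.2. |
| cp1251 | Windows-1251, win-1251, 1251 | Windows-specific Cyrillic charset. Supported since PHP 4.3.2. |
| cp1252 | Windows-1252, 1252 | Windows-specific charset for Western European. |
| KOI8-R | koi8-ru, koi8r | Russian. Supported since PHP 4.3.2. |
| BIG5 | 950 | Traditional Chinese, mainly used in Taiwan. |
| GB2312 | 936 | Simplified Chinese, national standard character set. |
| BIG5-HKSCS | | Traditional Chinese, Big5 with extensions, mainly used in Hong Kong. |
| Shift_JIS | SJIS, 932 | Japanese. |
| EUC-JP | EUCJP | Japanese. |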
Note: ISO-8859-1 is used in place of any other unrecognized character set.
Note: This function doesn't support multi-byte character sets in PHP < 5.
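The effect of the charset argument is easiest to see on the raw bytes. The sketch below decodes the same entity into two of the character sets from Table 2; the variable names are chosen only for this illustration.

```php
<?php
// &eacute; is the named entity for é (U+00E9).
$entity = '&eacute;';

// ISO-8859-1 output: a single byte, 0xE9.
$latin1 = html_entity_decode($entity, ENT_COMPAT, 'ISO-8859-1');
// UTF-8 output: two bytes, 0xC3 0xA9.
$utf8 = html_entity_decode($entity, ENT_COMPAT, 'UTF-8');

// bin2hex shows the raw bytes regardless of the terminal's encoding.
echo bin2hex($latin1), "\n";
echo bin2hex($utf8), "\n";
```

For Chinese text such as the sample in the question, the charset should be one that can represent the characters, for example UTF-8.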