C#: "hexadecimal value 0x12, is an invalid character"?

Content sourced from Stack Overflow; translated and used under the CC BY-SA 3.0 license.

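I load a lot of XML documents, and some of them fail with errors such as "hexadecimal value 0x12, is an invalid character", where the offending character differs from file to file. How can I remove these characters?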

Answer:

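The ASCII table contains 128 characters (codes 0 to 127).

Here is a small piece of test code that substitutes each ASCII character, one at a time, for a placeholder in an XML file and tries to load the result as an XML document.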

using System;
using System.IO;
using System.Xml;

public static void RegexTry()
{
    // Assumes test.xml contains the placeholder character 'П'
    // wherever the candidate character should be substituted.
    string xmlfile = File.ReadAllText(@"test.xml");

    for (int i = 0; i < 128; i++)
    {
        char t = (char)i;

        // Substitute the current ASCII character for the placeholder.
        string text = xmlfile.Replace('П', t);

        XmlDocument xml = new XmlDocument();
        try
        {
            xml.LoadXml(text);
        }
        catch (XmlException ex)
        {
            // The parser rejected this character; report why.
            Console.WriteLine("Char(" + i + "): " + t + " => error! " + ex.Message);
            continue;
        }

        Console.WriteLine("Char(" + i + "): " + t + " => fine!");
    }

    Console.ReadKey();
}

It prints the following:

Char(0): => error! '.', hexadecimal value 0x00, is an invalid character. Line 5, position 7.
Char(1): => error! '', hexadecimal value 0x01, is an invalid character. Line 5, position 7.
Char(2): => error! '', hexadecimal value 0x02, is an invalid character. Line 5, position 7.
Char(3): => error! '', hexadecimal value 0x03, is an invalid character. Line 5, position 7.
Char(4): => error! '', hexadecimal value 0x04, is an invalid character. Line 5, position 7.
Char(5): => error! '', hexadecimal value 0x05, is an invalid character. Line 5, position 7.
Char(6): => error! '', hexadecimal value 0x06, is an invalid character. Line 5, position 7.
Char(7): => error! '', hexadecimal value 0x07, is an invalid character. Line 5, position 7.
Char(8): => error! '', hexadecimal value 0x08, is an invalid character. Line 5, position 7.
Char(9):     => fine!
Char(10): 
 => fine!
Char(11): => error! '', hexadecimal value 0x0B, is an invalid character. Line 5, position 7.
Char(12): => error! '', hexadecimal value 0x0C, is an invalid character. Line 5, position 7.
Char(13): 
 => fine!
Char(14): => error! '', hexadecimal value 0x0E, is an invalid character. Line 5, position 7.
Char(15): => error! '', hexadecimal value 0x0F, is an invalid character. Line 5, position 7.
Char(16): => error! '', hexadecimal value 0x10, is an invalid character. Line 5, position 7.
Char(17): => error! '', hexadecimal value 0x11, is an invalid character. Line 5, position 7.
Char(18): => error! '', hexadecimal value 0x12, is an invalid character. Line 5, position 7.
Char(19): => error! '', hexadecimal value 0x13, is an invalid character. Line 5, position 7.
Char(20): => error! '', hexadecimal value 0x14, is an invalid character. Line 5, position 7.
Char(21): => error! '', hexadecimal value 0x15, is an invalid character. Line 5, position 7.
Char(22): => error! '', hexadecimal value 0x16, is an invalid character. Line 5, position 7.
Char(23): => error! '', hexadecimal value 0x17, is an invalid character. Line 5, position 7.
Char(24): => error! '', hexadecimal value 0x18, is an invalid character. Line 5, position 7.
Char(25): => error! '', hexadecimal value 0x19, is an invalid character. Line 5, position 7.
Char(26): => error! '', hexadecimal value 0x1A, is an invalid character. Line 5, position 7.
Char(27): => error! '', hexadecimal value 0x1B, is an invalid character. Line 5, position 7.
Char(28): => error! '', hexadecimal value 0x1C, is an invalid character. Line 5, position 7.
Char(29): => error! '', hexadecimal value 0x1D, is an invalid character. Line 5, position 7.
Char(30): => error! '', hexadecimal value 0x1E, is an invalid character. Line 5, position 7.
Char(31): => error! '', hexadecimal value 0x1F, is an invalid character. Line 5, position 7.
Char(32):   => fine!
Char(33): ! => fine!
Char(34): " => fine!
Char(35): # => fine!
Char(36): $ => fine!
Char(37): % => fine!
Char(38): & => error! An error occurred while parsing EntityName. Line 5, position 8.
Char(39): ' => fine!
Char(40): ( => fine!
Char(41): ) => fine!
Char(42): * => fine!
Char(43): + => fine!
Char(44): , => fine!
Char(45): - => fine!
Char(46): . => fine!
Char(47): / => fine!
Char(48): 0 => fine!
Char(49): 1 => fine!
Char(50): 2 => fine!
Char(51): 3 => fine!
Char(52): 4 => fine!
Char(53): 5 => fine!
Char(54): 6 => fine!
Char(55): 7 => fine!
Char(56): 8 => fine!
Char(57): 9 => fine!
Char(58): : => fine!
Char(59): ; => fine!
Char(60): < => error! The '<' character, hexadecimal value 0x3C, cannot be included in a name. Line 5, position 13.
Char(61): = => fine!
Char(62): > => fine!
Char(63): ? => fine!
Char(64): @ => fine!
Char(65): A => fine!
Char(66): B => fine!
Char(67): C => fine!
Char(68): D => fine!
Char(69): E => fine!
Char(70): F => fine!
Char(71): G => fine!
Char(72): H => fine!
Char(73): I => fine!
Char(74): J => fine!
Char(75): K => fine!
Char(76): L => fine!
Char(77): M => fine!
Char(78): N => fine!
Char(79): O => fine!
Char(80): P => fine!
Char(81): Q => fine!
Char(82): R => fine!
Char(83): S => fine!
Char(84): T => fine!
Char(85): U => fine!
Char(86): V => fine!
Char(87): W => fine!
Char(88): X => fine!
Char(89): Y => fine!
Char(90): Z => fine!
Char(91): [ => fine!
Char(92): \ => fine!
Char(93): ] => fine!
Char(94): ^ => fine!
Char(95): _ => fine!
Char(96): ` => fine!
Char(97): a => fine!
Char(98): b => fine!
Char(99): c => fine!
Char(100): d => fine!
Char(101): e => fine!
Char(102): f => fine!
Char(103): g => fine!
Char(104): h => fine!
Char(105): i => fine!
Char(106): j => fine!
Char(107): k => fine!
Char(108): l => fine!
Char(109): m => fine!
Char(110): n => fine!
Char(111): o => fine!
Char(112): p => fine!
Char(113): q => fine!
Char(114): r => fine!
Char(115): s => fine!
Char(116): t => fine!
Char(117): u => fine!
Char(118): v => fine!
Char(119): w => fine!
Char(120): x => fine!
Char(121): y => fine!
Char(122): z => fine!
Char(123): { => fine!
Char(124): | => fine!
Char(125): } => fine!
Char(126): ~ => fine!
Char(127):  => fine!  

As you can see, many of these characters cannot appear in XML. To strip them out, you can use Regex.Replace:

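// Requires: using System.Text.RegularExpressions;
static string ReplaceHexadecimalSymbols(string txt)
{
    // Matches the control characters rejected above, plus '&' (\x26).
    // Note: removing '&' also breaks valid entity references such as &amp;.
    string r = @"[\x00-\x08\x0B\x0C\x0E-\x1F\x26]";
    return Regex.Replace(txt, r, "", RegexOptions.Compiled);
}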
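
Because the character class above includes 0x26, it deletes every '&', which would also damage well-formed entity references such as &amp;. As a possible alternative, here is a minimal sketch that leaves '&' alone and relies on XmlConvert.IsXmlChar, assuming .NET Framework 4.0 or later where that method is available. The names XmlSanitizer and StripInvalidXmlChars are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlSanitizer
{
    // Hypothetical helper: returns a copy of the input without characters
    // that are not allowed in XML 1.0. It does not touch '&' or '<'.
    public static string StripInvalidXmlChars(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];

            // A valid surrogate pair encodes U+10000..U+10FFFF, which XML allows.
            if (char.IsHighSurrogate(c) && i + 1 < text.Length && char.IsLowSurrogate(text[i + 1]))
            {
                sb.Append(c).Append(text[i + 1]);
                i++; // skip the low surrogate that was just copied
            }
            else if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            // Anything else (control characters, lone surrogates) is dropped.
        }
        return sb.ToString();
    }
}
```

It could be applied just before parsing, for example xml.LoadXml(XmlSanitizer.StripInvalidXmlChars(text)). Note that this silently discards data, so it is worth finding out where the invalid characters come from as well.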

Answer:

The XML specification defines the allowed characters as follows:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
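
The production above can be transcribed almost literally into code. The following is a minimal sketch, not a definitive implementation: XmlCharCheck, IsLegalXmlCodePoint and IndexOfFirstIllegalChar are hypothetical names, and the second method is only meant to help locate the offending character in a string before deciding how to handle it.

```csharp
using System;

public static class XmlCharCheck
{
    // Direct transcription of the Char production quoted above.
    public static bool IsLegalXmlCodePoint(int cp)
    {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the string index of the first illegal character, or -1 if there is none.
    public static int IndexOfFirstIllegalChar(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            int start = i;
            int cp;
            if (char.IsSurrogatePair(text, i))
            {
                // Combine the two UTF-16 code units into one code point.
                cp = char.ConvertToUtf32(text, i);
                i++;
            }
            else
            {
                // A lone surrogate falls in 0xD800..0xDFFF and fails the check below.
                cp = text[i];
            }

            if (!IsLegalXmlCodePoint(cp))
            {
                return start;
            }
        }
        return -1;
    }
}
```

By this definition 0x12 is illegal, which matches the error in the question, while tab (0x9), line feed (0xA) and carriage return (0xD) are the only characters below 0x20 that are accepted.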
