Q: How to use Unicode in Regex

Stack Overflow user

Asked 2016-06-23 09:17:41

Answers: 1 · Views: 497 · Followers: 0 · Votes: 1

I am writing a regex to find the lines in a text file that match Unicode characters.

!Regex.IsMatch(colCount.line, @"^[\p{IsBasicLatin}\p{IsLatinExtended-A}\p{IsLatinExtended-B}]+$")

Below is the full code I have written:

var _fileName = @"C:\text.txt";

BadLinesLst = File
    .ReadLines(_fileName, Encoding.UTF8)
    .Select((line, index) =>
    {
        var count = line.Count(c => Delimiter == c) + 1;
        if (NumberOfColumns < 0)
            NumberOfColumns = count;

        return new
        {
            line = line,
            count = count,
            index = index
        };
    })
    .Where(colCount => colCount.count != NumberOfColumns ||
        Regex.IsMatch(colCount.line, @"[^\p{IsBasicLatin}\p{IsLatinExtended-A}\p{IsLatinExtended-B}]"))
    .Select(colCount => colCount.line).ToList();

The file contains the following lines:

264162-03,66,JITK,2007,12,874.000 ,0.000 ,0.000

6420œ50-00,67,JITK,2007,12,2292.000 ,0.000 ,0.000

4804元75-00,67,JITK,2007,12,1810.000 ,0.000 ,0.000

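If a line in the file contains any character other than BasicLatin, LatinExtended-A, or LatinExtended-B characters, I need to get that line. The regex above does not work properly: it also returns the lines that contain LatinExtended-A or LatinExtended-B characters.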

regex

unicode

asp.net

1 Answer

Stack Overflow user

Accepted answer

Answered 2016-06-23 09:19:54

Just put the Unicode block classes (the \p{Is...} named blocks) inside a negated character class.

if (Regex.IsMatch(colCount.line, 
         @"[^\p{IsBasicLatin}\p{IsLatinExtended-A}\p{IsLatinExtended-B}]")) 
{ /* Do sth here */ }

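This regex finds a partial match (because Regex.IsMatch looks for a pattern match anywhere inside a larger string). The pattern matches any single character other than those in the \p{IsBasicLatin}, \p{IsLatinExtended-A}, and \p{IsLatinExtended-B} Unicode blocks.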
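As a minimal sketch of the partial-match behaviour described above, the snippet below wraps the negated class in a helper and runs it on shortened versions of the sample lines. The `LineFilter` class and `HasDisallowedChar` method are hypothetical names introduced only for this illustration; they are not part of the original question or answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineFilter
{
    // Matches any single character outside the three named blocks:
    // Basic Latin (U+0000-U+007F), Latin Extended-A (U+0100-U+017F)
    // and Latin Extended-B (U+0180-U+024F).
    private static readonly Regex Disallowed = new Regex(
        @"[^\p{IsBasicLatin}\p{IsLatinExtended-A}\p{IsLatinExtended-B}]");

    // Hypothetical helper: true when the line contains at least one
    // character outside the three blocks.
    public static bool HasDisallowedChar(string line)
    {
        return Disallowed.IsMatch(line);
    }

    public static void Main()
    {
        string[] samples =
        {
            "264162-03,66,JITK",   // ASCII only
            "6420\u015350-00,67",  // contains U+0153 (Latin Extended-A)
            "4804\u514375-00,67"   // contains U+5143 (a CJK ideograph)
        };

        foreach (var s in samples)
            Console.WriteLine("{0}: {1}", s, HasDisallowedChar(s));
    }
}
```

Note that the Latin-1 Supplement block (U+0080-U+00FF) is not among the three listed blocks, so characters such as é would also be reported by this pattern unless \p{IsLatin-1Supplement} is added to the class.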

You can also check with the following code:

if (Regex.IsMatch(colCount.line, 
     @"^[^\p{IsBasicLatin}\p{IsLatinExtended-A}\p{IsLatinExtended-B}]*$")) 
{ /* Do sth here */ }

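This returns true if the entire colCount.line string contains no characters from the three Unicode blocks listed in the negated character class, or if the string is empty (to disallow empty strings, replace * with +).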
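To make the difference between the anchored variants concrete, here is a small sketch that compares an anchored negated class with an anchored positive class. The `AnchoredCheck` class and its two methods are hypothetical names used only for illustration. The sketch assumes the input has no trailing newline, which holds for lines returned by File.ReadLines.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchoredCheck
{
    private const string Blocks =
        @"\p{IsBasicLatin}\p{IsLatinExtended-A}\p{IsLatinExtended-B}";

    // True when every character is OUTSIDE the three blocks,
    // or when the string is empty (because of the * quantifier).
    public static bool OnlyOutside(string s)
    {
        return Regex.IsMatch(s, "^[^" + Blocks + "]*$");
    }

    // True when every character is INSIDE the three blocks,
    // or when the string is empty.
    public static bool OnlyInside(string s)
    {
        return Regex.IsMatch(s, "^[" + Blocks + "]*$");
    }

    public static void Main()
    {
        string mixed = "4804\u514375-00"; // digits plus one CJK ideograph
        Console.WriteLine(OnlyOutside(mixed)); // mixed content fails this check
        Console.WriteLine(OnlyInside(mixed));  // and fails this one too
        Console.WriteLine(OnlyOutside(""));    // empty string passes with *
    }
}
```

Under these assumptions, a line that should be reported as bad (one containing any character outside the three blocks) is one where `OnlyInside(line)` is false. That gives the same result as the unanchored negated-class check in the first snippet of this answer.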

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/37987284
