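I want to extract the name, date of birth, passport number, and other information from a passport MRZ code. My MRZ code is:
"P<YEMALSHARAFI<<MARWAH<ABDULBARI<MAHYOUB<S<<08033227<1YEM9201017F2412311<<<<<<<<<<<<<<08"
First, I want to skip the prefix "P<YEM". Then I want to store the name parts "ALSHARAFI", "MARWAH", "ABDULBARI", "MAHYOUB", and "S" in variable_1, variable_2, variable_3, variable_4, and variable_5, skipping any "<<" or "<". After that, I want to store the passport number "08033227" in its own variable, and then store the nationality "YEM", the sex "F", the date of birth "92-01-01", and the expiry date "24-12-31".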
Posted on 2019-06-10 10:57:13
This expression captures all of the information step by step:
([A-Z]+)<(.+?)<<((.+?)<?)<([A-Z]+)<<([0-9]+)<([0-9]+)([A-Z]+)([0-9]{2})([0-9]{2})([0-9]{2})([0-9]+)([A-Z])([0-9]{2})([0-9]{2})([0-9]{2})([0-9]+)([<]+)([0-9]+)
I am not sure what some of the digits represent, but the expression is easy to modify.
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        // One capture group per MRZ component; groups are numbered left to right.
        string pattern = @"([A-Z]+)<(.+?)<<((.+?)<?)<([A-Z]+)<<([0-9]+)<([0-9]+)([A-Z]+)([0-9]{2})([0-9]{2})([0-9]{2})([0-9]+)([A-Z])([0-9]{2})([0-9]{2})([0-9]{2})([0-9]+)([<]+)([0-9]+)";

        // Two sample MRZ strings, one per line of the verbatim string.
        string input = @"P<YEMALSHARAFI<<MARWAH<ABDULBARI<MAHYOUB<S<<08033227<1YEM9201017F2412311<<<<<<<<<<<<<<08
P<SMITH<<ALICE<BOB<GEORGE<KATE<OTHERS<S<<08033227<1YEM9201017F2412311<<<<<<<<<<<<<<08";

        RegexOptions options = RegexOptions.Multiline;

        // Print every full match together with its starting index.
        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}
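The answer says the expression is easy to modify. One possible modification, sketched here in JavaScript, skips the "P<YEM" prefix that the question asks to ignore (the original expression captures "YEMALSHARAFI" as a single group) and uses named groups for the requested fields. The names `MRZ_PATTERN`, `formatDate`, and `parseMrz` are hypothetical. The sketch assumes the one-line layout of the example: a single-word surname, a passport number made only of digits, and one digit after each of the passport number, the birth date, and the expiry date that is skipped.

```javascript
// Hypothetical pattern, not from the original answer. It assumes the
// one-line layout of the example MRZ in the question.
const MRZ_PATTERN = new RegExp(
  "^P<[A-Z]{3}" +                      // document type and 3-letter code, skipped
  "(?<surname>[A-Z]+)<<" +             // assumption: single-word surname
  "(?<given>[A-Z]+(?:<[A-Z]+)*)<+" +   // given names separated by single "<"
  "(?<passportNo>[0-9]+)<*[0-9]" +     // digits-only number, filler, one skipped digit
  "(?<nationality>[A-Z]{3})" +
  "(?<birth>[0-9]{6})[0-9]" +          // YYMMDD plus one skipped digit
  "(?<sex>[A-Z])" +
  "(?<expiry>[0-9]{6})[0-9]"           // YYMMDD plus one skipped digit
);

// Turns "920101" into "92-01-01", the format used in the question.
function formatDate(yymmdd) {
  return `${yymmdd.slice(0, 2)}-${yymmdd.slice(2, 4)}-${yymmdd.slice(4, 6)}`;
}

// Returns an object with the requested fields, or null if the input
// does not have the assumed shape.
function parseMrz(mrz) {
  const m = MRZ_PATTERN.exec(mrz);
  if (m === null) {
    return null;
  }
  const g = m.groups;
  return {
    surname: g.surname,
    givenNames: g.given.split("<"),
    passportNo: g.passportNo,
    nationality: g.nationality,
    sex: g.sex,
    birthDate: formatDate(g.birth),
    expiryDate: formatDate(g.expiry),
  };
}

console.log(
  parseMrz("P<YEMALSHARAFI<<MARWAH<ABDULBARI<MAHYOUB<S<<08033227<1YEM9201017F2412311<<<<<<<<<<<<<<08")
);
```

Because the given names come back as an array, the five name variables from the question correspond to `surname` followed by the entries of `givenNames`. Input that does not follow the assumed layout, such as a passport number containing letters, yields `null` rather than a partial result.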
The expression can be visualized on jex.im. The JavaScript demo below runs it and prints every captured group:
const regex = /([A-Z]+)<(.+?)<<((.+?)<?)<([A-Z]+)<<([0-9]+)<([0-9]+)([A-Z]+)([0-9]{2})([0-9]{2})([0-9]{2})([0-9]+)([A-Z])([0-9]{2})([0-9]{2})([0-9]{2})([0-9]+)([<]+)([0-9]+)/gm;
const str = `P<YEMALSHARAFI<<MARWAH<ABDULBARI<MAHYOUB<S<<08033227<1YEM9201017F2412311<<<<<<<<<<<<<<08
P<SMITH<<ALICE<BOB<GEORGE<KATE<OTHERS<S<<08033227<1YEM9201017F2412311<<<<<<<<<<<<<<08`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }

    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
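The demo prints the groups by index, so it may help to see which index holds which requested field. The following sketch reuses the answer's expression unchanged (without the `g` flag, since it handles one line at a time) and maps the numbered groups for the MRZ from the question. `ANSWER_PATTERN` and `fieldsFromGroups` are hypothetical names, and dropping the first three letters of group 2 assumes the name field always starts with a three-letter country code such as "YEM".

```javascript
// The expression from the answer, unchanged apart from dropping the flags.
const ANSWER_PATTERN = /([A-Z]+)<(.+?)<<((.+?)<?)<([A-Z]+)<<([0-9]+)<([0-9]+)([A-Z]+)([0-9]{2})([0-9]{2})([0-9]{2})([0-9]+)([A-Z])([0-9]{2})([0-9]{2})([0-9]{2})([0-9]+)([<]+)([0-9]+)/;

// Hypothetical helper, not part of the original answer.
function fieldsFromGroups(mrz) {
  const m = ANSWER_PATTERN.exec(mrz);
  if (m === null) {
    return null;
  }
  return {
    // Assumption: group 2 begins with a three-letter country code, so drop it.
    surname: m[2].slice(3),
    // Group 4 holds all given names but the last; group 5 holds the last one.
    givenNames: [...m[4].split("<"), m[5]],
    passportNo: m[6],
    nationality: m[8],
    // Groups 9 to 11 are the two-digit year, month, and day of birth.
    birthDate: `${m[9]}-${m[10]}-${m[11]}`,
    sex: m[13],
    // Groups 14 to 16 are the two-digit year, month, and day of expiry.
    expiryDate: `${m[14]}-${m[15]}-${m[16]}`,
  };
}

console.log(
  fieldsFromGroups("P<YEMALSHARAFI<<MARWAH<ABDULBARI<MAHYOUB<S<<08033227<1YEM9201017F2412311<<<<<<<<<<<<<<08")
);
```

Groups 7, 12, 17, and 19 are the digits the answer is unsure about, and group 18 is the run of "<" filler, so the sketch leaves them out.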
https://stackoverflow.com/questions/-100006957