
Regular expression for PBXProject files

Stack Overflow user
Asked 2012-09-07 16:30:22
Answers 2, Views 152, Followers 0, Votes 2

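I'm working on a pure Ruby implementation of a parser for the Xcode project file format (PBXProject), and I need a little help with a regular expression.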

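A PBXProject file contains a lot of odd lines of hash-like identifiers with content mixed in. What I have now is the regular expression (.*?) = (.*?)( \/\* (.*) \*\/)?; ? which handles the simpler case (the first line below). On the second line, however, it stops too early, at the first ; character.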

isa = PBXBuildFile; fileRef = C0480C2015F4F91F00E0A2F4 /* zip.c */;

isa = PBXBuildFile; fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */; settings = {COMPILER_FLAGS = "-fno-objc-arc"; };
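
To make the problem concrete, here is a minimal Ruby sketch that runs the expression from the question over the second sample line with String#scan. The constant names SAMPLE_LINE, ASKER_REGEX and ASKER_MATCHES are made up for this illustration and are not part of the original post.

```ruby
# The second sample line from the question, split across two literals for width.
SAMPLE_LINE = 'isa = PBXBuildFile; fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */; ' \
              'settings = {COMPILER_FLAGS = "-fno-objc-arc"; };'

# The expression from the question, written as a Ruby regex literal.
ASKER_REGEX = /(.*?) = (.*?)( \/\* (.*) \*\/)?; ?/

# With capture groups, scan returns one array of group values per match.
ASKER_MATCHES = SAMPLE_LINE.scan(ASKER_REGEX)

ASKER_MATCHES.each do |name, value, _comment_block, comment|
  puts "#{name} => #{value.inspect} (comment: #{comment.inspect})"
end
```

The value captured for settings is cut off at the ; inside the braces, because the lazy (.*?) stops at the first ; it can reach.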

So what I want are simple name = value pairs, i.e.

isa = PBXBuildFile
settings = {COMPILER_FLAGS = "-fno-objc-arc"; }

Is there a simple way to do this with a single regular expression?

2 Answers

Stack Overflow user

Accepted answer

Posted 2012-09-07 19:33:26

This regular expression works well:

[a-zA-Z0-9]*\s*?=\s*?.*?(?:{[^}]*}|(?=;))

Note that only one level of braces is allowed; the regular expression does not handle nested braces.
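
The nesting limitation can be illustrated with a short Ruby sketch. The input string below is a hypothetical example with two levels of braces (it is not taken from a real project file), and the constant names are made up for the illustration. The braces in the pattern are escaped, which does not change what it matches.

```ruby
# The expression above as a Ruby literal, with the literal braces escaped.
ONE_LEVEL_REGEX = /[a-zA-Z0-9]*\s*?=\s*?.*?(?:\{[^}]*\}|(?=;))/

# Hypothetical input with a brace block nested inside another brace block.
NESTED_INPUT = 'outer = {inner = {a = 1; }; b = 2; };'

# Without capture groups, scan returns the matched substrings themselves.
NESTED_MATCHES = NESTED_INPUT.scan(ONE_LEVEL_REGEX)

NESTED_MATCHES.each { |match| puts match }
```

Because [^}]* stops at the first closing brace, the match for outer ends after the inner block, and b = 2 is then reported as if it were a top-level pair.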

From your example, the following will be captured:

isa = PBXBuildFile
fileRef = C0480C2015F4F91F00E0A2F4 /* zip.c */
isa = PBXBuildFile
fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */
settings = {COMPILER_FLAGS = "-fno-objc-arc"; }
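
As a rough sketch of how this could be used from Ruby, the snippet below scans each sample line and splits every match at its first = to build a Hash. The helper parse_pairs and the constant names are hypothetical, introduced only for this example; the braces in the pattern are escaped, which does not change what it matches.

```ruby
# The accepted answer's expression as a Ruby literal, braces escaped.
PAIR_REGEX = /[a-zA-Z0-9]*\s*?=\s*?.*?(?:\{[^}]*\}|(?=;))/

# The two sample lines from the question.
SAMPLE_LINES = [
  'isa = PBXBuildFile; fileRef = C0480C2015F4F91F00E0A2F4 /* zip.c */;',
  'isa = PBXBuildFile; fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */; ' \
  'settings = {COMPILER_FLAGS = "-fno-objc-arc"; };'
].freeze

# Hypothetical helper: turn one line into a Hash of name => value strings.
# Each match is split at its first "=" only, so "=" inside a value is preserved.
def parse_pairs(line)
  line.scan(PAIR_REGEX).map { |match| match.split(/\s*=\s*/, 2) }.to_h
end

SAMPLE_LINES.each { |line| p parse_pairs(line) }
```

The values are kept as raw strings, including the /* ... */ comment after fileRef; stripping the comment would be a separate step.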

Here is an explanation of the regex:

[a-zA-Z0-9]*\s*?=\s*?.*?(?:{[^}]*}|(?=;))

Options: ^ and $ match at line breaks

Match a single character present in the list below «[a-zA-Z0-9]*»
    Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
    A character in the range between “a” and “z” «a-z»
    A character in the range between “A” and “Z” «A-Z»
    A character in the range between “0” and “9” «0-9»
Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “=” literally «=»
Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match any single character that is not a line break character «.*?»
    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the regular expression below «(?:{[^}]*}|(?=;))»
    Match either the regular expression below (attempting the next alternative only if this one fails) «{[^}]*}»
        Match the character “{” literally «{»
        Match any character that is NOT a “}” «[^}]*»
            Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
        Match the character “}” literally «}»
    Or match regular expression number 2 below (the entire group fails if this one fails to match) «(?=;)»
        Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=;)»
            Match the character “;” literally «;»
Votes: 1

Stack Overflow user

Posted 2012-09-07 18:49:08

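Depending on the exact nature of what you want to parse, it may not be possible to parse it with a single regular expression. The second line, the one giving you trouble, suggests that nested patterns may be involved. A regular expression can only match nesting up to a fixed depth, which is one of the reasons parsing XHTML with regex is discouraged. If you really want to handle arbitrarily deep nesting, you may want to look at something like Treetop.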
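
For illustration only, here is a small hand-written alternative in plain Ruby that tracks brace depth instead of using a regular expression. It is not Treetop and not a full PBXProject parser; the method name split_top_level is hypothetical, and the sketch assumes that double-quoted strings contain no escaped quotes and that comments contain no braces, quotes or semicolons.

```ruby
# Split text into top-level "name = value" statements. A ';' only ends a
# statement when we are outside all {...} blocks and outside a "..." string.
def split_top_level(text)
  statements = []
  current = +""
  depth = 0
  in_string = false

  text.each_char do |ch|
    if ch == '"'
      in_string = !in_string        # assumption: no escaped quotes
    elsif !in_string && ch == "{"
      depth += 1
    elsif !in_string && ch == "}"
      depth -= 1
    elsif !in_string && ch == ";" && depth.zero?
      stripped = current.strip
      statements << stripped unless stripped.empty?
      current = +""
      next                          # drop the terminating ';'
    end
    current << ch
  end

  statements
end

p split_top_level('outer = {inner = {a = 1; }; b = 2; }; c = 3;')
```

Applying the same method to the contents of a brace block would descend one level at a time, which is how a sketch like this could handle nesting of any depth.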

If you don't need it to be robust, you could try an expression like this:

/((?i)(?:[^;]+=\s*\{.*?\})|[^;]+=[^;]+);/

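This first tries to match something of the form something = {anything}; if that fails, it matches something = something up to the next ;. You should be able to use string.scan(/regex/) to find all matches in a given string. Handling blocks this way avoids problems such as the match ending too early, and makes it easy to extract the pairs.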
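
A minimal sketch of the scan approach in Ruby follows. The helper extract_statements and the constant names are hypothetical. The (?i) flag from the expression above is left out here on the assumption that it has no effect, since the pattern contains no literal letters.

```ruby
# The expression above without the (?i) flag: try "name = {block}" first,
# then fall back to "name = value", each followed by a terminating ';'.
BLOCK_OR_SIMPLE = /((?:[^;]+=\s*\{.*?\})|[^;]+=[^;]+);/

SECOND_LINE = 'isa = PBXBuildFile; fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */; ' \
              'settings = {COMPILER_FLAGS = "-fno-objc-arc"; };'

# Hypothetical helper: scan yields one single-element array per match
# (the pattern has one capture group), so flatten and trim each statement.
def extract_statements(line)
  line.scan(BLOCK_OR_SIMPLE).flatten.map(&:strip)
end

extract_statements(SECOND_LINE).each { |statement| puts statement }
```

The strip call is needed because [^;]+ also picks up the space that follows the previous ;.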

Further reading:

  • Regular grammar
  • Context-free grammar
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/12314482
