Question: Regular expression for PBXProject files

Stack Overflow user

Asked 2012-09-07 16:30:22

Answers: 2 | Views: 152 | Followers: 0 | Votes: 2

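I'm working on a pure-Ruby implementation of a parser for Xcode project files (PBXProject) and need a bit of help with a regular expression.

A PBXProject file contains a bunch of oddly encoded lines with mixed content. The regex I have now, (.*?) = (.*?)( \/\* (.*) \*\/)?; ?, handles the simpler case (the first line below). On the second line, however, it cuts the match off too early, at the first ; character.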

isa = PBXBuildFile; fileRef = C0480C2015F4F91F00E0A2F4 /* zip.c */;

isa = PBXBuildFile; fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */; settings = {COMPILER_FLAGS = "-fno-objc-arc"; };
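The truncation can be reproduced with a short script. This is only a sketch: the regex is the one quoted above, while the constant and helper names (NAIVE, SAMPLE, naive_pairs) are made up for illustration.

```ruby
# Regex from the question, unchanged.
NAIVE = /(.*?) = (.*?)( \/\* (.*) \*\/)?; ?/

# The second sample line from the question.
SAMPLE = 'isa = PBXBuildFile; fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */; ' \
         'settings = {COMPILER_FLAGS = "-fno-objc-arc"; };'

# Hypothetical helper: returns [name, value] for every match in the line.
# The last two capture groups (the optional comment) are ignored here.
def naive_pairs(line)
  line.scan(NAIVE).map { |name, value, _comment, _text| [name, value] }
end

naive_pairs(SAMPLE).each { |name, value| puts "#{name} => #{value}" }
```

The last pair printed is settings => {COMPILER_FLAGS = "-fno-objc-arc", because the lazy value group stops at the first ; it finds, which is the one inside the braces.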

So what I would like to get is simple name = value pairs, i.e.

isa = PBXBuildFile
settings = {COMPILER_FLAGS = "-fno-objc-arc"; }

Is there a simple way to do this with a single regex?

ruby

regex

2 Answers

Stack Overflow user

Accepted answer

Posted 2012-09-07 19:33:26

This regex works well:

[a-zA-Z0-9]*\s*?=\s*?.*?(?:{[^}]*}|(?=;))

Note that only one level of braces is supported; the regex will not handle nested braces.

From your example, it captures the following:

isa = PBXBuildFile
fileRef = C0480C2015F4F91F00E0A2F4 /* zip.c */
isa = PBXBuildFile
fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */
settings = {COMPILER_FLAGS = "-fno-objc-arc"; }
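As a minimal sketch of how this pattern might be used from Ruby, the snippet below runs it through String#scan and splits each match into a name and a value. The braces are escaped here for readability, and the names PAIR and name_value_pairs are hypothetical, not part of the answer.

```ruby
# Pattern from the answer, with the literal braces escaped.
PAIR = /[a-zA-Z0-9]*\s*?=\s*?.*?(?:\{[^}]*\}|(?=;))/

# Hypothetical helper: scans one line and returns a Hash of name => value.
# The pattern has no capture groups, so scan yields the full matched text;
# each match is then split on its first "=" only.
def name_value_pairs(line)
  line.scan(PAIR).map { |pair| pair.split(/\s*=\s*/, 2) }.to_h
end

line = 'isa = PBXBuildFile; fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */; ' \
       'settings = {COMPILER_FLAGS = "-fno-objc-arc"; };'

name_value_pairs(line).each { |name, value| puts "#{name} => #{value}" }
```

Splitting with a limit of 2 keeps the inner COMPILER_FLAGS = ... assignment inside the settings value instead of breaking it apart.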

Here is an explanation of the regex:

[a-zA-Z0-9]*\s*?=\s*?.*?(?:{[^}]*}|(?=;))

Options: ^ and $ match at line breaks

Match a single character present in the list below «[a-zA-Z0-9]*»
    Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
    A character in the range between “a” and “z” «a-z»
    A character in the range between “A” and “Z” «A-Z»
    A character in the range between “0” and “9” «0-9»
Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “=” literally «=»
Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match any single character that is not a line break character «.*?»
    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the regular expression below «(?:{[^}]*}|(?=;))»
    Match either the regular expression below (attempting the next alternative only if this one fails) «{[^}]*}»
        Match the character “{” literally «{»
        Match any character that is NOT a “}” «[^}]*»
            Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
        Match the character “}” literally «}»
    Or match regular expression number 2 below (the entire group fails if this one fails to match) «(?=;)»
        Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=;)»
            Match the character “;” literally «;»

Votes: 1

Stack Overflow user

Posted 2012-09-07 18:49:08

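Depending on the exact nature of what you want to parse, it may not be possible to do it with a single finite expression. The second line you are having trouble with suggests that nested patterns may be involved. A regular expression can only match nesting up to a fixed depth, which is one of the reasons parsing XHTML with regex is discouraged. If you really need to handle arbitrarily deep nesting, you may want to look at something like Treetop.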
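For illustration only, here is a hand-written alternative that tracks brace depth instead of using a grammar library. It is not Treetop and not a full parser: split_top_level is a hypothetical helper, and it assumes that braces and semicolons never appear inside quoted strings or comments.

```ruby
# Hypothetical helper: splits text on ';' only where brace depth is zero,
# so nested { ... } blocks of any depth stay inside a single entry.
def split_top_level(text)
  parts = []
  current = +''   # unary plus gives a mutable string
  depth = 0
  text.each_char do |ch|
    depth += 1 if ch == '{'
    depth -= 1 if ch == '}'
    if ch == ';' && depth.zero?
      # A top-level ';' closes the current entry.
      parts << current.strip unless current.strip.empty?
      current = +''
    else
      current << ch
    end
  end
  parts << current.strip unless current.strip.empty?
  parts
end

puts split_top_level('isa = PBXBuildFile; settings = {A = {B = 1; }; C = 2; };')
```

A counter like this is the simplest way to get past the fixed-depth limit, at the cost of ignoring quoting rules that a real grammar would handle.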

If you don't need it to be robust, you could try an expression like this:

/((?i)(?:[^;]+=\s*\{.*?\})|[^;]+=[^;]+);/

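This first tries to match something of the form something = {anything}; if that fails, it matches something = something up to the next ;. You should be able to use string.scan(/regex/) to find all matches in a given string. Handling blocks this way avoids problems such as ending the match too early, and makes it easy to extract the pairs.

Further reading: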
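A rough sketch of the scan call mentioned above might look like this. The (?i) flag is left out because the pattern contains no letters, so it has no effect; BLOCK_OR_SIMPLE and scan_pairs are hypothetical names.

```ruby
# Expression from the answer, minus the (?i) flag.
BLOCK_OR_SIMPLE = /((?:[^;]+=\s*\{.*?\})|[^;]+=[^;]+);/

# Hypothetical helper: with one capture group, scan returns an array of
# one-element arrays, so flatten it and trim the leading spaces that
# [^;]+ picks up after each ';'.
def scan_pairs(line)
  line.scan(BLOCK_OR_SIMPLE).flatten.map(&:strip)
end

line = 'isa = PBXBuildFile; fileRef = C0480C2315F4F91F00E0A2F4 /* ZipArchive.mm */; ' \
       'settings = {COMPILER_FLAGS = "-fno-objc-arc"; };'

puts scan_pairs(line)
```

Because .*? stops at the first } that is followed by ;, a nested block would still be cut short at its inner closing brace.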

Regular grammar
Context-free grammar

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/12314482
