
Javascript RegExp for a long sentence

Stack Overflow user
Asked 2018-08-10 02:38:16

I'm trying to strip the system info out of an email body that I get from the getPlainBody() function.

I tried to write a regexp, but this one is too hard for me:

System info: Mi A1, androidOS(v7.99.2)(android)(3593)(rev.136)(cbf87b2346eabe6ef)(6c72426bbc-151c-449c-a33d-3733234d404f)(SomeuserName23542)

I tried .replace(/System+[a-zA-Z0-9._-]+)/gi,''), but it gives an error. I also tried playing with the first and last parts, but it seems I just don't quite understand the rules.


1 Answer

Stack Overflow user

Accepted answer

Posted on 2018-08-10 22:52:12

Update

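Since you've stated that you need to remove every line that starts with System info: and ends with 7 groups of parenthesised strings, this should work for you: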

Code language: javascript
.replace(/^(?:System info:)(?:[^(]+(?=\())?(?:\([^)]+\)){0,7}$/gim, '');

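This pattern matches at most 7 groups of parenthesised strings (I'm not sure whether there are always 7 groups, so I treated that as an upper bound).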
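To make that concrete, here is a minimal sketch that applies the pattern to a made-up email body. The surrounding lines ("Hello...", "Thanks!") and the variable names are assumptions for illustration; in your script the string would come from getPlainBody() instead.

```javascript
// Hypothetical email body; only the "System info:" line comes from the question.
const body = [
  "Hello, the app crashes on start.",
  "System info: Mi A1, androidOS(v7.99.2)(android)(3593)(rev.136)(cbf87b2346eabe6ef)(6c72426bbc-151c-449c-a33d-3733234d404f)(SomeuserName23542)",
  "Thanks!",
].join("\n");

const systemInfoLine = /^(?:System info:)(?:[^(]+(?=\())?(?:\([^)]+\)){0,7}$/gim;

// The matched line is replaced by an empty string; its line break stays in place.
const cleaned = body.replace(systemInfoLine, "");
console.log(JSON.stringify(cleaned)); // "Hello, the app crashes on start.\n\nThanks!"

// Because 7 is an upper bound, a line with 8 groups is left untouched.
const eightGroups = "System info: X(1)(2)(3)(4)(5)(6)(7)(8)";
const unchanged = eightGroups.replace(systemInfoLine, "") === eightGroups;
console.log(unchanged); // true
```

Note that an empty line is left behind where the system info used to be, because the pattern stops at the end of the line and does not consume the line break.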

Breaking the pattern down:

Code language: javascript
^                   // start of line (multiline mode)
(?:                 // start non-capturing group
    System info:    // exactly match the literal text "System info:"
)                   // end non-capturing group
(?:                 // start non-capturing group
    [^(]            // match anything that is not a literal "("
    +               //      at least once, and as many times as possible
    (?=             // start positive lookahead group
        \(          // match a literal "("
    )               // end positive lookahead group
)                   // end non-capturing group
?                   // make it optional
(?:                 // start non-capturing group
    \(              // match a literal "("
    [^)]+           // match one or more characters that are not a literal ")"
    \)              // match a literal ")"
)                   // end non-capturing group
{0,7}               // repeat the group between 0 and 7 times
$                   // end of line (multiline mode)

You can test strings against the match here.

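For the record, + tells the RegEx engine to match whatever precedes it at least once and as many times as possible, greedily (meaning the engine only gives characters back when that is absolutely necessary for the overall match to succeed).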
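A small sketch of what "greedy" means in practice. The sample string is made up, and the lazy form .+? is not used anywhere in the patterns above; it is shown only for contrast.

```javascript
const sample = "(a)(b)(c)";

// Greedy: ".+" grabs as much as it can, then gives characters back
// only until the final "\)" can match.
const greedy = sample.match(/\(.+\)/)[0];

// Lazy: ".+?" grabs as little as it can and extends only when needed.
const lazy = sample.match(/\(.+?\)/)[0];

console.log(greedy); // "(a)(b)(c)"
console.log(lazy);   // "(a)"
```

This is also why the pattern above uses [^)]+ inside each pair of parentheses: a negated class cannot run past the closing ")", so each group stays confined to one pair.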

Original

Without knowing more about the output you want, my best guess at what you're looking for is this (which may need some explanation):

Code language: javascript
.replace(/^System info:[\w\d\s(),._-]+$/gim, '');

Breaking it down...

Code language: javascript
^                   // start of line (in multiline mode)
System info:        // exactly match the literal string "System info:"
[\w\d\s(),._-]+     // match any amount of characters that are either:
                    //      "A" through "Z",
                    //      or "a" through "z",
                    //      or "0" through "9",
                    //      or are whitespace,
                    //      or a literal "(",
                    //      or a literal ")",
                    //      or a literal ",",
                    //      or a literal ".",
                    //      or a literal "_",
                    //      or a literal "-",
$                   // end of line (in multiline mode)

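You can test it here. Also, note the m flag on the regex passed to replace: it turns on multiline mode, which lets ^ match at the start of each line rather than only at the start of the whole string, and lets $ match at the end of each line rather than only at the end of the whole string.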
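A quick sketch of the difference the m flag makes, using the pattern from this section on made-up input (the "Hi!" and "Bye!" lines and the variable names are assumptions):

```javascript
const mail = "Hi!\nSystem info: Mi A1, androidOS(v7.99.2)\nBye!";

// Without "m", ^ only matches at index 0, so nothing is removed.
const withoutM = mail.replace(/^System info:[\w\d\s(),._-]+$/gi, "");

// With "m", ^ and $ also match around each line break.
const withM = mail.replace(/^System info:[\w\d\s(),._-]+$/gim, "");

// \s matches line breaks too, so the class can run on into a following
// line that contains only characters from the class.
const runOn = "System info: Mi A1\nnext line".replace(/^System info:[\w\d\s(),._-]+$/gim, "");

console.log(withoutM === mail);     // true
console.log(JSON.stringify(withM)); // "Hi!\n\nBye!"
console.log(JSON.stringify(runOn)); // ""
```

The last case is worth keeping in mind: with this looser pattern, a plain line directly after the system info can be swallowed as well, which the stricter parenthesised-group pattern avoids.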

Unless... you want to capture the information (which makes the regex more complex):

Code language: javascript
^(System(?:\s+)?info):(?:(?:(?:\s+)?((?:[\w\d._-]+)?(?:(?:\([\w\d.-]+\))+)?)?,?))+$

And of course, breaking it down...

Code language: javascript
^                               // start of line (in multiline mode)
(                               // start of first capture group
    System                      // exactly match the string "System"
    (?:                         // start a non-capturing group
        \s+                     // match one or more whitespace characters
    )?                          // end non-capturing group and make it optional
    info                        // exactly match the string "info"
)                               // end of first capture group
:                               // exactly match the string ":"
(?:                             // start a non-capturing group (the repeated unit)
    (?:                         // start a non-capturing group (redundant wrapper)
        (?:                     // start a non-capturing group
            \s+                 // match one or more whitespace characters
        )?                      // end non-capturing group and make it optional
        (                       // start of second capture group
            (?:                 // start a non-capturing group
                [\w\d._-]+      // match one or more characters that are either:
                                //      "A" through "Z",
                                //      or "a" through "z",
                                //      or "0" through "9",
                                //      or a literal ".",
                                //      or a literal "_",
                                //      or a literal "-",
            )?                  // end non-capturing group and make it optional
            (?:                 // start a non-capturing group
                (?:             // start a non-capturing group
                    \(          // exactly match a literal "("
                    [\w\d.-]+   // match one or more characters that are either:
                                //      "A" through "Z",
                                //      or "a" through "z",
                                //      or "0" through "9",
                                //      or a literal ".",
                                //      or a literal "_" (via \w),
                                //      or a literal "-",
                    \)          // exactly match a literal ")"
                )+              // end non-capturing group; repeat it one or more times
            )?                  // end non-capturing group and make it optional
        )?                      // end second capture group and make it optional
        ,?                      // optionally match a literal ","
    )                           // end non-capturing group (redundant wrapper)
)+                              // end the repeated unit; repeat it one or more times
$                               // end of line (in multiline mode)

You can test it here
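As a rough sketch of how the capturing version behaves, here it is run against a shortened sample line. The i and m flags, the shortened input, and the variable names are assumptions made for this example.

```javascript
const captureInfo = /^(System(?:\s+)?info):(?:(?:(?:\s+)?((?:[\w\d._-]+)?(?:(?:\([\w\d.-]+\))+)?)?,?))+$/im;

const line = "System info: Mi A1, androidOS(v7.99.2)(android)(3593)";
const found = line.match(captureInfo);

console.log(found[1]); // "System info"

// A capture group inside a repeated group keeps only its last repetition,
// so "Mi" and "A1" are matched but not retained.
console.log(found[2]); // "androidOS(v7.99.2)(android)(3593)"
```

If you need every piece separately, it is usually simpler to match the whole line first and then split the remainder in a second step than to pull everything out with a single pattern.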

Another great resource for learning more about regular expressions is https://www.regular-expressions.info/ (although I don't believe it has a built-in sandbox like the one at https://regex101.com).

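Finally, as Barmar and CertainPerformance correctly pointed out in the comments, the solution you tried, .replace(/System+[a-zA-Z0-9._-]+)/gi,''), won't work, for two reasons:

  1. The closing ) is not marked as a literal character (i.e. \) or [)]), and there is no opening non-literal ( anywhere for it to pair with, which causes an error.
  2. The + after the m in System+ does not match the space after System; it matches System, or the literal Syste followed by one or more m characters (e.g. Systemmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm).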
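Both points can be checked with a short sketch. The pattern is built from a string here (an assumption made purely so the error can be caught at run time; written as a regex literal it would stop the whole script from parsing).

```javascript
// 1. The unmatched ")" makes the pattern invalid.
let errorName = null;
try {
  new RegExp("System+[a-zA-Z0-9._-]+)", "gi");
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // "SyntaxError"

// 2. "System+" means "Syste" followed by one or more "m".
const mRun = /System+/i;
const manyMs = "Systemmmm".match(mRun)[0];
const beforeSpace = "System info".match(mRun)[0];

console.log(manyMs);      // "Systemmmm"
console.log(beforeSpace); // "System" (the space is not matched)
```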
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/51773798
