
Regular Expressions

云台大树
Modified 2021-08-25 11:29:21

If you’re thinking in terms of object-oriented programming, your first impulse might be to start defining objects for the various elements in the world: a class for the robot, one for a parcel, maybe one for places. These could then hold properties that describe their current state, such as the pile of parcels at a location, which we could change when updating the world.

This is wrong.

At least, it usually is. The fact that something sounds like an object does not automatically mean that it should be an object in your program. Reflexively writing classes for every concept in your application tends to leave you with a collection of interconnected objects that each have their own internal, changing state. Such programs are often hard to understand and thus easy to break.

Declaring a regular expression

// The backslash (\) is the escape character
let re1 = new RegExp("abc");
let re2 = /abc/;
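The two notations differ in how backslashes are written. As a small sketch (the variable names are illustrative and not from the original), a backslash has to be doubled in the string passed to RegExp, because the string literal consumes one level of escaping first, while the literal form needs a backslash in front of a forward slash:

```javascript
// The same pattern, "one or more digits", in both notations.
const fromString = new RegExp("\\d+"); // "\\d" in the string becomes \d in the pattern
const fromLiteral = /\d+/;

console.log(fromString.source === fromLiteral.source); // → true
console.log(fromString.test("room 101"));              // → true

// In a literal, a forward slash must be escaped so it does not end the pattern.
const slashDate = /\d+\/\d+/;
console.log(slashDate.test("12/25")); // → true
```

The constructor form is mostly useful when the pattern is assembled from strings at run time.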

Testing for a match

/abc/.test("abcded");  // → true
"abcde".match(/abc/);  // → ["abc"] (a match array, or null when nothing matches)

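Matching character sets

Within square brackets, a hyphen (-) between two characters can be used to indicate a range of characters, where the ordering is determined by the character’s Unicode number.

// A-Z and a-z are listed separately; A-z would also include [ \ ] ^ _ `
/[0-9A-Za-z]/.test("abc234"); // → true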
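Because ranges follow code-point order, a range that spans both letter cases, such as A-z, also takes in the six punctuation characters that sit between Z and a. A quick check (the names tooWide and lettersAndDigits are illustrative):

```javascript
// Code points: "Z" is 90 and "a" is 97; 91 to 96 are [ \ ] ^ _ `
const tooWide = /^[0-9A-z]+$/;
const lettersAndDigits = /^[0-9A-Za-z]+$/;

console.log("Z".charCodeAt(0), "a".charCodeAt(0)); // → 90 97
console.log(tooWide.test("snake_case"));           // → true (the underscore slips through)
console.log(lettersAndDigits.test("snake_case"));  // → false
console.log(lettersAndDigits.test("abc234"));      // → true
```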

Pattern   Description
\d        Any digit character
\w        An alphanumeric character (word character)
\s        Any whitespace character (space, tab, newline, and similar)
\D        A character that is not a digit
\W        A nonalphanumeric character
\S        A nonwhitespace character
.         Any character except for newline
^         Inside [], inverts the set: matches any character that is not listed

let notBinary = /[^01]/;
console.log(notBinary.test("0101010101")); // → false
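The shorthand classes can be combined into larger patterns and also used inside square brackets. A short sketch, with variable names chosen for illustration:

```javascript
// A date and time such as 01-30-2003 15:20, written with \d only.
const dateTime = /\d\d-\d\d-\d\d\d\d \d\d:\d\d/;
console.log(dateTime.test("01-30-2003 15:20"));  // → true
console.log(dateTime.test("30-jan-2003 15:20")); // → false

// Inside [], \d keeps its meaning while the period loses its special role.
const digitOrDot = /[\d.]/;
console.log(digitOrDot.test("no numbers here")); // → false
console.log(digitOrDot.test("version 3"));       // → true
```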

Repetition

Pattern   Description
+         The element repeats one or more times (>= 1)
*         The element repeats zero or more times (>= 0)
?         The element occurs zero times or once (0 or 1)
{m}       The element occurs exactly m times (= m)
{n,m}     The element occurs at least n and at most m times (n <= count <= m)
{m,}      The element occurs at least m times (>= m)
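To make the repetition operators concrete, here is a brief sketch that exercises each of them once; the sample strings and variable names are made up for illustration:

```javascript
const neighbor = /neighbou?r/;          // ? makes the u optional
console.log(neighbor.test("neighbour")); // → true
console.log(neighbor.test("neighbor"));  // → true

const quotedDigits = /'\d+'/;            // + needs at least one digit
const maybeEmptyDigits = /'\d*'/;        // * also accepts none
console.log(quotedDigits.test("''"));     // → false
console.log(maybeEmptyDigits.test("''")); // → true

// Braces give exact counts or ranges: one or two digits, four digits, and so on.
const looseDateTime = /\d{1,2}-\d{1,2}-\d{4} \d{1,2}:\d{2}/;
console.log(looseDateTime.test("1-30-2003 8:45")); // → true

const fiveOrMore = /^\d{5,}$/;           // open-ended range: at least five digits
console.log(fiveOrMore.test("1234"));    // → false
console.log(fiveOrMore.test("123456"));  // → true
```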

Groups

To use an operator like * or + on more than one element at a time, you have to use parentheses. A part of a regular expression that is enclosed in parentheses counts as a single element as far as the operators following it are concerned.

// The first + applies only to the second o in boo,
// the second + applies only to the second o in hoo inside the parentheses,
// and the third + applies to the whole group (hoo+).
let cartoonCrying = /boo+(hoo+)+/i;
console.log(cartoonCrying.test("Boohooooohoohooo")); // → true

A regular expression's test method only reports whether the match succeeded; the exec method returns an object describing the match (or null when there is none).

When the regular expression contains subexpressions grouped with parentheses, the text that matched those groups will also show up in the array. The whole match is always the first element. The next element is the part matched by the first group (the one whose opening parenthesis comes first in the expression), then the second group, and so on. In other words, the whole expression comes first, followed by the groups in the order of their opening parentheses. This is a very handy feature.

let quotedText = /'([^']*)'/;
console.log(quotedText.exec("she said 'hello'")); // → ["'hello'", "hello"]

When a group does not end up being matched at all (for example, when followed by a question mark), its position in the output array will hold undefined. Similarly, when a group is matched multiple times, only the last match ends up in the array.

// When a group is followed by ? (0 or 1 times) and matches zero times, its slot in the result holds undefined
console.log(/bad(ly)?/.exec("bad")); // → ["bad", undefined]
// a group matches multiple times, only the last match counts!
console.log(/(\d)+/.exec("123")); // → ["123", "3"]
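Since exec returns null on failure and leaves undefined in the slot of an unmatched group, code that reads the result usually guards against both. A minimal sketch, using a made-up pattern and sample strings:

```javascript
const sizePattern = /(\d+)(px)?/;

const match = sizePattern.exec("width: 100");
console.log(match.index); // → 7 (where the match starts)
console.log(match[1]);    // → 100
console.log(match[2]);    // → undefined (the optional group did not take part)

// Fall back to a default when the optional group is missing.
const unit = match[2] ?? "none";
console.log(unit);        // → none

// No match at all: exec returns null, so check before indexing.
const noMatch = sizePattern.exec("no digits");
console.log(noMatch);     // → null
```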

A nice example of what groups make possible:

function getDate(string) {
  // _ receives the whole match; month, day, and year receive the first, second, and third groups
  let [_, month, day, year] =
    /(\d{1,2})-(\d{1,2})-(\d{4})/.exec(string);
  return new Date(year, month - 1, day); // months are zero-based, days start at 1
}
console.log(getDate("1-30-2003")); // → Thu Jan 30 2003 00:00:00 GMT+0100 (CET)

Referring to groups in a replacement

The $1 and $2 in the replacement string refer to the parenthesized groups in the pattern. $1 is replaced by the text that matched against the first group, $2 by the second, and so on, up to $9. The whole match can be referred to with $&.

The following example swaps the order of first and last names:

console.log(
  "Liskov, Barbara\nMcCarthy, John\nWadler, Philip"
    .replace(/(\w+), (\w+)/g, "$2 $1"));
// → Barbara Liskov
//   John McCarthy
//   Philip Wadler
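Two related details are worth seeing in action: $& stands for the whole match, and without the g flag only the first match is replaced. A small sketch with invented sample strings:

```javascript
// $& inserts the whole match into the replacement.
const bracketed = "hello world".replace(/\w+/g, "[$&]");
console.log(bracketed); // → [hello] [world]

// Without g only the first match is replaced; with g every match is.
const firstOnly = "Borobudur".replace(/[ou]/, "a");
const all = "Borobudur".replace(/[ou]/g, "a");
console.log(firstOnly); // → Barobudur
console.log(all);       // → Barabadar
```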

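In the next example a function is passed as the replacement argument; the arguments that this function receives are documented at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#specifying_a_function_as_a_parameter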

let stock = "1 lemon, 2 cabbages, and 101 eggs";
function minusOne(match, amount, unit) {
  amount = Number(amount) - 1;
  if (amount == 1) { // only one left, remove the 's'
    unit = unit.slice(0, unit.length - 1);
  } else if (amount == 0) {
    amount = "no";
  }
  return amount + " " + unit;
}
console.log(stock.replace(/(\d+) (\w+)/g, minusOne));
// → no lemon, 1 cabbage, and 100 eggs
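For a simpler view of the same mechanism, the sketch below uses only the first callback argument, then records the offset argument that follows the match when the pattern has no groups. The sample strings and the positions array are illustrative:

```javascript
// The callback's return value replaces each match.
const shouted = "the cia and fbi".replace(/\b(fbi|cia)\b/g,
                                          match => match.toUpperCase());
console.log(shouted); // → the CIA and FBI

// With no groups in the pattern, the second argument is the match offset.
const positions = [];
"a1b22".replace(/\d+/g, (match, offset) => {
  positions.push([match, offset]);
  return match; // leave the text unchanged
});
console.log(positions); // → [["1", 1], ["22", 3]]
```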

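Word and string boundaries

The getDate function has a problem, though: it does not reject the string "100-1-3000" and instead extracts 00-1-3000 from it. To make a pattern span the whole string, use the markers ^ (start of input) and $ (end of input); \b marks a word boundary.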
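The problem described above can be addressed by anchoring the pattern. A minimal sketch, in which getDateStrict is a hypothetical name and returning null for invalid input is an assumed design choice rather than something the original prescribes:

```javascript
// Unanchored: the pattern is free to match a fragment of the input.
const loose = /(\d{1,2})-(\d{1,2})-(\d{4})/;
console.log(loose.exec("100-1-3000")[0]); // → 00-1-3000

// Anchored with ^ and $: the pattern must account for the whole string.
function getDateStrict(string) {
  const match = /^(\d{1,2})-(\d{1,2})-(\d{4})$/.exec(string);
  if (match === null) return null; // reject instead of matching a fragment
  const [, month, day, year] = match;
  return new Date(Number(year), Number(month) - 1, Number(day));
}

console.log(getDateStrict("100-1-3000"));              // → null
console.log(getDateStrict("1-30-2003").getFullYear()); // → 2003
```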

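A word boundary can be the start or end of the string or any point in the string that has a word character (as in \w) on one side and a nonword character on the other.

An important detail: \b may appear anywhere in a regular expression, including at its very start or end. It matches a position rather than a character, and it succeeds only when exactly one of the two neighbouring characters is a word character (\w); the start and end of the string count as nonword neighbours. In other words, it matches the position where a word starts or ends, because only there does a word character sit next to a nonword character.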

console.log(/\bcat\b/.exec(" con cat enate")); // → ["cat"]
console.log(/a\b.\bnice/.exec("it's a nice day")); // → [ 'a nice', index: 5, input: "it's a nice day", groups: undefined ]
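A few extra checks may help show where boundaries fall. In this sketch (sample strings are invented), the last line marks every boundary position with a bar, which works because \b matches a position and consumes no characters:

```javascript
const catWord = /\bcat\b/;
console.log(catWord.test("concatenate")); // → false (letters on both sides)
console.log(catWord.test("cat"));         // → true  (string start and end are boundaries)
console.log(catWord.test("cat-like"));    // → true  (a hyphen is a nonword character)
console.log(catWord.test("cat_food"));    // → false (an underscore is a word character)

// Insert a bar at every word boundary.
const marked = "hello world".replace(/\b/g, "|");
console.log(marked); // → |hello| |world|
```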

Choice patterns

Use the pipe character (|) to express a choice between alternatives.

let animalCount = /\b\d+ (pig|cow|chicken)s?\b/;
console.log(animalCount.test("15 pigs")); // → true
console.log(animalCount.test("15 pigchickens")); // → false
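The parentheses in animalCount matter because | otherwise splits the entire pattern, anchors included. A short sketch with two made-up patterns shows the difference:

```javascript
// Without a group: "starts with cat" OR "ends with dog".
const ungrouped = /^cat|dog$/;
// With a group: the whole string is exactly "cat" or "dog".
const grouped = /^(cat|dog)$/;

console.log(ungrouped.test("catalog")); // → true
console.log(ungrouped.test("hotdog"));  // → true
console.log(grouped.test("catalog"));   // → false
console.log(grouped.test("dog"));       // → true
```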

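The way a choice pattern is matched can be pictured as a flow diagram.

[Figure: flow diagram of how the choice pattern is matched]

Taking the expression above, consider matching it against the string "the 3 pigs".

At position 4 there is a word boundary, so the match passes the first box. Still at position 4 there is a digit, so it also passes the second box, and so on through the remaining boxes.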

Expression   Description
/abc/        A sequence of characters
/[abc]/      Any character from a set of characters
/[^abc]/     Any character not in a set of characters
/[0-9]/      Any character in a range of characters
/x+/         One or more occurrences of the pattern x
/x+?/        One or more occurrences, nongreedy
/x*/         Zero or more occurrences
/x?/         Zero or one occurrence
/x{2,4}/     Two to four occurrences
/(abc)/      A group
/a|b|c/      Any one of several patterns
/\d/         Any digit character
/\w/         An alphanumeric character (“word character”)
/\s/         Any whitespace character
/./          Any character except newlines
/\b/         A word boundary
/^/          Start of input
/$/          End of input
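The nongreedy form /x+?/ is the one entry in this summary that has no example elsewhere in these notes. A brief sketch with a made-up input: a greedy operator takes as much as it can and backtracks only when needed, while the nongreedy one takes as little as possible:

```javascript
const input = "<a><b>";

// Greedy: .+ runs to the last > it can reach.
const greedy = input.match(/<.+>/)[0];
// Nongreedy: .+? stops at the first > that completes the match.
const lazy = input.match(/<.+?>/)[0];

console.log(greedy); // → <a><b>
console.log(lazy);   // → <a>
```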

Matching a word that contains no e or E

let regexp = /\b[^eE\s]+\b/; // excluding \s and repeating with + are the key tricks
console.log(regexp.test("earth bed")); // → false
console.log(regexp.exec("learning ape")); // → null
console.log(regexp.exec("BEET")); // → null

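Replacing single quotes

let text = "'I'm the cook,' he said, 'it's my job.'";
// Replace a quote only at the start or end of the string or next to a nonword character, so apostrophes inside words are kept
console.log(text.replace(/(^|\W)'|'(\W|$)/g, '$1"$2')); // → "I'm the cook," he said, "it's my job."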

Matching numbers in strings

The group (-|\+|) is a neat trick: it matches a minus sign, a plus sign, or nothing.

let number = /^(-|\+|)(\d+|(\.\d+|\d+\.\d*)|(\d+(\.|)\d*e(-|\+|)\d+))$/i;

// Tests:
for (let str of ["1", "-1", "+15", "1.55", ".5", "5.",
                 "1.3e2", "1E-4", "1e+12"]) {
  if (!number.test(str)) {
    console.log(`Failed to match '${str}'`);
  }
}
for (let str of ["1a", "+-1", "1.2.3", "1+1", "1e4.5",
                 ".5.", "1f5", "."]) {
  if (number.test(str)) {
    console.log(`Incorrectly accepted '${str}'`);
  }
}
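For comparison, the same checks can be run against a more compact pattern that uses an optional sign class and an optional exponent group. This is offered as an alternative sketch, not as the author's solution; compactNumber and the two arrays are names introduced here. One behavioural difference: the compact pattern also accepts a leading-dot number with an exponent, such as ".5e3".

```javascript
// Optional sign, then digits with optional fraction or a bare fraction,
// then an optional exponent.
const compactNumber = /^[+\-]?(\d+(\.\d*)?|\.\d+)([eE][+\-]?\d+)?$/;

const accepted = ["1", "-1", "+15", "1.55", ".5", "5.",
                  "1.3e2", "1E-4", "1e+12"];
const rejected = ["1a", "+-1", "1.2.3", "1+1", "1e4.5",
                  ".5.", "1f5", "."];

console.log(accepted.every(str => compactNumber.test(str))); // → true
console.log(rejected.some(str => compactNumber.test(str)));  // → false
console.log(compactNumber.test(".5e3"));                     // → true
```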

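Original content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, contact cloudcommunity@tencent.com to request removal.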
