
A match is made as soon as the pattern is satisfied; nothing looks ahead to check that the dot is not repeated, so <code>\\.{1}</code> happily matches the first of two dots. I'm not sure it is general enough, but a negated character class works for the single test case offered:

<pre><code>&gt; gsub(pattern=" \\.[^.]| \\.$",replacement=".",x="Something .")
[1] "Something."
&gt; gsub(pattern=" \\.[^.]| \\.$",replacement=".",x="Something ..")
[1] "Something .."
</code></pre>
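To make the point about <code>{1}</code> concrete, and to show one limit of the pattern above, here is a minimal sketch. The input vector <code>x</code> and the variable names are made up for illustration. The capture-group variant at the end is only one possible adjustment, not something the question asked for: it puts back the character that <code>[^.]</code> would otherwise consume.

```r
# Hypothetical inputs, chosen for illustration only.
x <- c("Something .", "Something ..", "Something . Next")

# {1} means "exactly once", which is what an unquantified atom already means,
# so the two patterns give identical results.
same <- identical(gsub(" \\.{1}", ".", x), gsub(" \\.", ".", x))
same
# [1] TRUE

# The two-branch pattern also consumes the character that follows the dot.
eaten <- gsub(" \\.[^.]| \\.$", ".", x)
eaten
# [1] "Something."     "Something .."   "Something.Next"

# Capture that character (or the end of the string) and put it back.
kept <- gsub(" \\.([^.]|$)", ".\\1", x)
kept
# [1] "Something."      "Something .."    "Something. Next"
```

In the third element the space after the dot survives only in the capture-group version.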


You can remove everything from the last space before the comment through the end of the string, and then paste a <code>.</code> back onto the end, no?

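<pre><code># tt holds the sentence from the question
tt &lt;- "Some sentence is about something &lt;find a citation to this&gt;."
# capture everything up to the last space before &lt;, then drop that space,
# the comment, and whatever follows it up to the end of the string
paste0(gsub("(.*)[ ].*&lt;.*$", "\\1", tt), ".")
# [1] "Some sentence is about something."
</code></pre>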

That said, you should <a href="https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags">really read this</a>.

Alternatively, if the comments occur in the middle of a sentence and you just want to remove them along with the spaces in front of them, then:

<pre><code># remove everything within &lt;...&gt; including &lt; and &gt;
# and any spaces preceding them
gsub("[ ]*&lt;.*?&gt;", "", tt)
# [1] "Some sentence is about something."

# example:
tt &lt;- ".. some sentences are wrong &lt;bla bla&gt;. But some are &lt;bla bla&gt; right."
gsub("[ ]*&lt;.*?&gt;", "", tt)
# [1] ".. some sentences are wrong. But some are right."
</code></pre>

Note the difference between <code>.*&gt;</code> and <code>.*?&gt;</code>. The first one is "greedy": it matches as many characters as it can, all the way to the last <code>&gt;</code>. The second one is "lazy": it stops at the first <code>&gt;</code>, which is what you want here, because every comment should be removed on its own.
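The difference is easiest to see on a string with two comments. This is a small sketch on a made-up input (<code>s</code> is not from the question):

```r
# Hypothetical string with two <...> comments.
s <- "a <x> b <y> c"

# Greedy: .* runs to the LAST ">", so the text between the comments goes too.
greedy <- gsub("<.*>", "", s)
greedy
# [1] "a  c"

# Lazy: .*? stops at the FIRST ">", so each comment is removed separately.
lazy <- gsub("<.*?>", "", s)
lazy
# [1] "a  b  c"
```

The doubled spaces are left over because these patterns do not touch the spaces around the comments.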


You can accomplish what you want with a negative lookahead, which is available in Perl-style regular expressions (<code>perl=TRUE</code>). It says: match the pattern, but only if it is not followed by this other pattern. A quick example:

<pre><code>&gt; gsub(pattern=" \\.(?!\\.)",replacement=".",x="Something .", perl=TRUE)
[1] "Something."
&gt; gsub(pattern=" \\.(?!\\.)",replacement=".",x="Something ..", perl=TRUE)
[1] "Something .."
</code></pre>
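Since <code>gsub</code> is vectorised, the same pattern can be run over a whole report at once. The sketch below assumes the file has already been read into a character vector, one element per line; the <code>report</code> vector is assembled by hand here from the lines shown in the question.

```r
# Hypothetical report lines: one cleaned sentence plus the knitr/reST chunk.
report <- c(
  "Some sentence is about something .",
  ".. {r chunk-name}",
  ".. some R code",
  ".. .."
)

# " ." becomes "." only when the dot is not followed by another dot.
cleaned <- gsub(" \\.(?!\\.)", ".", report, perl = TRUE)
cleaned
# [1] "Some sentence is about something." ".. {r chunk-name}"
# [3] ".. some R code"                    ".. .."
```

Only the first line changes; the chunk lines are left alone because their space is followed by two dots or by no dot at all.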

I am having trouble with a regular expression in R. The goal is to parse a Markdown/reST/knitr report text file in R to remove my own custom comments. These comments are put in the following form: 

<pre><code>Some sentence is about something &lt;find a citation to this&gt;.
</code></pre>

As Markdown uses &lt;> for HTML tags, I need to remove these comments (with my custom function) to avoid confusion. After I do that, the sentence takes the following form: 

<pre><code>Some sentence is about something .
</code></pre>

Note the space between the last word and the dot. It is easy to remove that, but then the text might contain reST comments incorporating R code (knitr), beginning with <code>..</code>:

<pre><code>.. {r chunk-name}
.. some R code 
.. ..
</code></pre>

So basically I need to replace the " ." in the former case, but not in the latter. I thought I would achieve this using the repetition modifier of R regexp atoms:

<pre><code>gsub(pattern=" \\.{1}",replacement=".",x="Something ..")
[1] "Something.."
</code></pre>

I was expecting that this expression would match a single space followed by a single dot (but not more). However, the string gets replaced regardless of whether there is one dot or two. I am a real newbie with this, so I am probably missing something obvious. Even so, any help will be greatly appreciated.

Regards,
Maxim

Regular expressions in R: pattern repetitions with {}
