
Use a regular expression:

<pre><code>&gt;&gt;&gt; s = "&lt; stuff to remove&gt; get this stuff &lt;stuff to remove&gt;"
&gt;&gt;&gt; import re
&gt;&gt;&gt; re.sub(r'&lt;[^&lt;&gt;]*&gt;', '', s)
' get this stuff '
</code></pre>

The expression <code>&lt;[^&lt;&gt;]*&gt;</code> matches any substring that starts with <code>&lt;</code>, ends with <code>&gt;</code>, and has neither <code>&lt;</code> nor <code>&gt;</code> in between. <code>re.sub</code> then replaces each match with the empty string, thus deleting it.

You can then call <code>.strip()</code> on the result to remove the leading and trailing spaces if you want.

Of course, this will fail when you have, for example, nested tags, but it will work for your example.
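To make the nested-tag caveat concrete, here is a minimal sketch. The helper name `strip_angle_spans` and the repeat-until-stable loop are assumptions added for illustration, not part of the answer above; the loop is just one possible way to cope with nesting.

```python
import re

# Matches '<', then any run of characters that are neither '<' nor '>', then '>'.
TAG = re.compile(r'<[^<>]*>')

def strip_angle_spans(s):
    """Hypothetical helper: remove <...> spans, innermost first, until none remain."""
    previous = None
    while previous != s:
        previous = s
        s = TAG.sub('', s)
    return s

flat = "< stuff to remove> get this stuff <stuff to remove>"
nested = "<a <b> c> keep <d>"

print(repr(TAG.sub('', flat)))          # ' get this stuff '
print(repr(TAG.sub('', nested)))        # '<a  c> keep '  (one pass leaves the outer span)
print(repr(strip_angle_spans(nested)))  # ' keep '
```

A single pass only removes spans that contain no other angle brackets, which is why the outer span survives in the nested case.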


Regular expressions would be a simple way to do this (although not necessarily faster as shown by jedwards' answer):

<pre><code>import re
s = '&lt; stuff to remove&gt; get this stuff &lt;stuff to remove&gt;'
s = re.sub(r'&lt;[^&gt;]*&gt;', '', s)
</code></pre>

After this <code>s</code> would be the string <code>' get this stuff '</code>.


I'm not sure whether the search operation you're doing is part of the question. If you're just saying that you have a start index and an end index and you want to remove those characters from a string, you don't need a special function for that. Python lets you use numeric indices for the characters in strings. 

<pre><code>&gt;&gt;&gt; x = "abcdefg"
&gt;&gt;&gt; x[1:3]
'bc'
</code></pre>

The operation you want to perform would be something like <code>x[:strt_idx] + x[end_idx:]</code>. (If you omit the first index of a slice it means "start from the beginning", and if you omit the second one it means "continue to the end".)
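As a small sketch of that slicing expression, assuming a hypothetical helper named `remove_span` (the name is not from the question or the answer):

```python
def remove_span(s, start, end):
    """Hypothetical helper: return s without the characters s[start:end]."""
    # Strings are immutable, so build a new string from the two kept pieces.
    return s[:start] + s[end:]

x = "abcdefg"
print(x[1:3])                # bc
print(remove_span(x, 1, 3))  # adefg
```

As with any slice, `end` is exclusive: the character at index `end` is kept.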


If you have the starting and ending index of the string, you could do something like:

<pre><code>substring = string[s_ind:e_ind]
</code></pre>

Where <code>s_ind</code> is the index of the first character you want to include in the string and <code>e_ind</code> is the index of the first character you don't want in the string.

For example:

<pre><code>string = "Long string of which I only want a small part"
#         012345678901234567890123456789012345678901234
#         0         1         2         3
substring = string[21:32]
print(substring)
</code></pre>

prints <code>I only want</code>

You could find the indices in the same manner you are now. 
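A minimal sketch of finding those indices with `str.find` and then slicing. The function name `text_between_tags` and the fallbacks for a missing delimiter are assumptions added here; `str.find` returns `-1` when nothing is found, so that case is handled explicitly.

```python
def text_between_tags(s):
    """Hypothetical helper: text after the first '>' and before the next '<'."""
    start = s.find('>')
    if start == -1:
        return s              # no closing bracket at all: nothing to cut
    start += 1                # first character we want to keep
    end = s.find('<', start)  # first character we do not want
    if end == -1:
        end = len(s)          # no following tag: keep everything to the end
    return s[start:end]

s = "< stuff to remove> get this stuff <stuff to remove>"
print(repr(text_between_tags(s)))    # ' get this stuff '
print(text_between_tags(s).strip())  # get this stuff
```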

<hr>

Edit: Regarding efficiency, this type of solution is actually more efficient than the regex solution. The reason is there is a lot of overhead involved in regular expressions that you don't necessarily need. 

I encourage you to test these things for yourself instead of blindly going on what people claim is most efficient.

Consider the following test program:

<pre><code>#!/usr/bin/env python

import re
import time

def inner_regex(s):
    return re.sub(r'&lt;[^&gt;]*&gt;', '', s)

def inner_substr(s):
    s_ind = s.find('&gt;') + 1
    e_ind = s.find('&lt;', s_ind)
    return s[s_ind:e_ind]


s = '&lt;stuff to remove&gt; get this stuff &lt;stuff to remove&gt;'

# Time 100,000 calls of the regex version
tr1 = time.time()
for i in range(100000):
    s1 = inner_regex(s)
tr2 = time.time()
print("Regex:     %f" % (tr2 - tr1))

# Time 100,000 calls of the find-and-slice version
ts1 = time.time()
for i in range(100000):
    s2 = inner_substr(s)
ts2 = time.time()
print("Substring: %f" % (ts2 - ts1))
</code></pre>

The output is:

<pre><code>Regex:     0.511443
Substring: 0.148062
</code></pre>

In other words, the regex approach is more than 3x slower than your original approach, once corrected.

<hr>

Edit: Regarding the comment about compiled regex, it is faster than uncompiled regex, but still slower than the explicit substring:

<pre><code>#!/usr/bin/env python

import re
import time

def inner_regex(s):
    return re.sub(r'&lt;[^&gt;]*&gt;', '', s)

def inner_regex_compiled(s, r):
    return r.sub('', s)

def inner_substr(s):
    s_ind = s.find('&gt;') + 1
    e_ind = s.find('&lt;', s_ind)
    return s[s_ind:e_ind]


s = '&lt;stuff to remove&gt; get this stuff &lt;stuff to remove&gt;'


# Uncompiled regex
tr1 = time.time()
for i in range(100000):
    s1 = inner_regex(s)
tr2 = time.time()


# Compiled regex (the one-time compile is included in the timing)
tc1 = time.time()
r = re.compile(r'&lt;[^&gt;]*&gt;')
for i in range(100000):
    s2 = inner_regex_compiled(s, r)
tc2 = time.time()


# Explicit find-and-slice
ts1 = time.time()
for i in range(100000):
    s3 = inner_substr(s)
ts2 = time.time()


print("Regex:          %f" % (tr2 - tr1))
print("Regex Compiled: %f" % (tc2 - tc1))
print("Substring:      %f" % (ts2 - ts1))
</code></pre>

Returns:

<pre><code>Regex:          0.512799  # &gt;3 times slower
Regex Compiled: 0.297863  # ~2 times slower
Substring:      0.144910
</code></pre>

Moral of the story: While regular expressions are a helpful tool to have in the toolbox, they're simply not as efficient as more straightforward ways when available.

And don't take people's word for things that you can easily test yourself.
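If you want to repeat the measurement yourself, the standard-library `timeit` module is a common alternative to hand-rolled `time.time()` loops. The sketch below is an assumption about how one might set that up, not the benchmark used above; the call counts are arbitrary, and the timings will differ by machine and Python version.

```python
import re
import timeit

PATTERN = re.compile(r'<[^>]*>')
s = '<stuff to remove> get this stuff <stuff to remove>'

def inner_regex(s):
    return re.sub(r'<[^>]*>', '', s)

def inner_regex_compiled(s):
    return PATTERN.sub('', s)

def inner_substr(s):
    s_ind = s.find('>') + 1
    e_ind = s.find('<', s_ind)
    return s[s_ind:e_ind]

# All three must agree before timing them is meaningful.
assert inner_regex(s) == inner_regex_compiled(s) == inner_substr(s)

for func in (inner_regex, inner_regex_compiled, inner_substr):
    # repeat() returns one total per run; the minimum is the least noisy figure.
    best = min(timeit.repeat(lambda: func(s), number=10000, repeat=3))
    print("%-22s %f" % (func.__name__ + ":", best))
```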

I have a bunch of long strings, so I am looking for an efficient way to do this operation.
Suppose I have a string like this:

<pre><code> "&lt; stuff to remove&gt; get this stuff &lt;stuff to remove&gt;"
</code></pre>

I am trying to extract "get this stuff".

So I am writing something like this:

<pre><code>strt_pos = 0
end_pos = 0
while True:
    strt_idx = string.find(start_point, strt_pos)  # start_point = "&lt;" in our example
    end_idx = string.find(end_point, end_pos)  # end_point = "&gt;" in our example
    chunk_to_remove = string[strt_idx:end_idx]
    # Now how do I chop this part off from the string??
    strt_pos = strt_pos + 1
    end_pos = end_pos + 1
    if strt_pos &gt;= len(string):  # or maybe end_pos &gt;= len(string)
        break
</code></pre>

What is a better way to implement this?

removing string based on start index and end index
