
Here's something that works for the sample. As @MrFlick mentioned, please try to share your data in a reproducible way. 

Data

<pre><code>&gt; dput(txt)
c("Next follow up on 6/13/2018 between 12 PM and 2 PM PST.", 
"will call you tomorrow between 4 - 6PM EST.", "will call you tomorrow between 12:00PM to 2:00PM CST", 
"will call you tomorrow between 11 AM to 12 PM EST", "Next follow up on 6/13/2018 between 12 PM TO 2 PM PST."
)
</code></pre>

Code

<pre><code>&gt; regmatches(txt, regexec('[[:space:]]([[:digit:]]{1,2}[[:space:]].*[[:upper:]]{3})', txt))
[[1]]
[1] " 12 PM and 2 PM PST" "12 PM and 2 PM PST" 

[[2]]
[1] " 4 - 6PM EST" "4 - 6PM EST" 

[[3]]
character(0)

[[4]]
[1] " 11 AM to 12 PM EST" "11 AM to 12 PM EST" 

[[5]]
[1] " 12 PM TO 2 PM PST" "12 PM TO 2 PM PST"
</code></pre>

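The output is a list in which each element is a character vector holding the full match followed by the captured group (see the help page for <code>regmatches</code>). The third element is <code>character(0)</code> because this first pattern requires whitespace right after the digits, so <code>12:00PM</code> does not match. Allowing a colon as well fixes that, and you can simplify the result to get only the captured group: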

<pre><code>&gt; unname(sapply(txt, function(z){
 pattern &lt;- '[[:space:]]([[:digit:]]{1,2}([[:space:]]|:).*[[:upper:]]{3})'
 k &lt;- unlist(regmatches(z, regexec(pattern = pattern, z)))
 return(k[2])
 }))
[1] "12 PM and 2 PM PST" "4 - 6PM EST" "12:00PM to 2:00PM CST" "11 AM to 12 PM EST" 
[5] "12 PM TO 2 PM PST" 
</code></pre>

This is based on the sample input. Of course, if the input is far too irregular, it will be hard to use a single regex. In that case, I'd recommend using several regex functions called one after the other, each tried only when the preceding ones return <code>NA</code>. Hope this is helpful!
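The fallback idea in the last paragraph could be sketched roughly as follows. The helper name `extract_first` and the second pattern are assumptions made for illustration, not part of the answer; only the first pattern comes from the code above.

```r
# Hypothetical helper: try each pattern in order and keep the first capture
# group that matches; strings that no pattern matches stay NA.
extract_first <- function(x, patterns) {
  out <- rep(NA_character_, length(x))
  for (p in patterns) {
    todo <- is.na(out)            # only retry strings still unmatched
    if (!any(todo)) break
    m <- regmatches(x[todo], regexec(p, x[todo]))
    # each element of m is character(0) on no match, else c(full, group1, ...)
    out[todo] <- vapply(
      m,
      function(k) if (length(k) >= 2) k[2] else NA_character_,
      character(1)
    )
  }
  out
}

patterns <- c(
  # pattern from the answer: 1-2 digits, then a space or colon, ... three capitals
  "[[:space:]]([[:digit:]]{1,2}([[:space:]]|:).*[[:upper:]]{3})",
  # assumed fallback: a single time with AM/PM glued to the hour, e.g. "6PM EST"
  "[[:space:]]([[:digit:]]{1,2}[AP]M[[:space:]]+[[:upper:]]{3})"
)

txt <- c("will call you tomorrow between 4 - 6PM EST.",
         "will call you at 6PM EST.",
         "no time here")
res <- extract_first(txt, patterns)
print(res)
```

The first string is handled by the first pattern, the second only by the fallback, and the third stays `NA`, so later patterns never overwrite an earlier hit.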


This code works for almost all of your specifications, except for the substring "4 - 6PM EST". It removes the date and the known lead-in phrases, then normalizes the separators. I hope it is useful on your whole data set.

<pre><code>data = c(
  "Next follow up on 6/13/2018 between 12 PM and 2 PM PST.",
  "will call you tomorrow between 4 - 6PM EST.",
  "will call you tomorrow between 12:00PM to 2:00PM CST",
  "will call you tomorrow between 11 AM to 12 PM EST",
  "Next follow up on 6/13/2018 between 12 PM TO 2 PM PST.")

# remove dates such as 6/13/2018
data = gsub("\\d{1,2}/\\d{1,2}/\\d{4}", "", data)

# phrases to delete, and substrings to substitute
excluded_texts = c("Next follow up on", "between", "will call you tomorrow", ":00", "\\.")
# the first entry is two spaces, so runs of blanks left by the deletions collapse
replaced_input = c("  ", "\'-", "and", "TO", " AM", " PM")
replaced_output = c("", "to", "to", "to", "AM", "PM")

# delete each excluded phrase in turn
for (i in excluded_texts) {
  data = gsub(i, "", data)
}

# apply each substitution in turn
for (j in seq_along(replaced_input)) {
  data = gsub(replaced_input[j], replaced_output[j], data)
}

print(data)
</code></pre>
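For the "4 - 6PM EST" case that the code above does not cover, one possible extra rule is sketched here. It is an assumption, not part of the answer: a hyphen between two digits is treated as a range separator and rewritten as "to", in the same spirit as the "and"/"TO" substitutions.

```r
# Assumed extra rule: a hyphen (with optional spaces) between two digits
# becomes " to ", keeping the digits on both sides.
x <- c("4 - 6PM EST", "12PM to 2PM CST")
fixed <- gsub("([[:digit:]])[[:space:]]*-[[:space:]]*([[:digit:]])", "\\1 to \\2", x)
print(fixed)
```

This rule would also rewrite hyphens inside real phone numbers or ISO dates, so it is probably safest to apply it only after those have been removed from the text.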

<pre><code>sub(".*?(\\d+\\s*[PA:-].*)","\\1",data)
[1] "12 PM and 2 PM PST."   "4 - 6PM EST."          "12:00PM to 2:00PM CST"
[4] "11 AM to 12 PM EST"    "12 PM TO 2 PM PST."
</code></pre>
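The one-line `sub()` call above skips a lazy prefix and then captures everything from the first run of digits that is followed (after optional spaces) by `P`, `A`, `:` or `-`. Its output keeps the sentence-final period. A minimal sketch of stripping it, where the second `sub()` is an added assumption and not part of the answer:

```r
data <- c("Next follow up on 6/13/2018 between 12 PM and 2 PM PST.",
          "will call you tomorrow between 12:00PM to 2:00PM CST")

# same call as in the answer: keep the text from the first time-like token onward
times <- sub(".*?(\\d+\\s*[PA:-].*)", "\\1", data)

# assumed extra step: drop a single trailing period, if there is one
clean <- sub("\\.$", "", times)
print(clean)
```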

I have a large body of text, read in as a data frame with a single text column. Some of the sentences mention a time, in different formats, as below:

Row 1. I tried to call you on xxx-xxx-xxxx, however reached voice mail I'm scheduling our next follow up on 6/13/2018 between 12 PM and 2 PM PST.

Row 2. I will call you again today if I hear something from them, if not, will call you tomorrow between 4 - 6PM EST.

Row 3. We will await for your reply, if we don't hear from you then we will call you tomorrow between 12:00PM to 2:00PM CST

Row 4. As discussed over the call, we scheduled call back for tomorrow between 12 - 02 PM EST. 

Row 5. As suggested by you, we will have our next follow up on 6/13/2018 between 12 PM TO 2 PM PST. 

I would like to extract just the time part along with the time zone (EST/CST/PST).

<blockquote>
 Expected Outputs:
 
 6/13/2018 4 PM - 6 PM EST 
 tomorrow 12 PM TO 2 PM PST
</blockquote>

I have tried the following:

<code>x &lt;- text$string</code>

<code>sc1 &lt;- str_match(x, " follow up on (.*?) T.")</code> 

which returns something like:

<blockquote>
 follow up on 6/13/2018 between 1 PM TO | 6/13/2018 between 1 PM
</blockquote>

I then tried to cover the other format with the code below

<code>sc2 &lt;- str_match(x, " will call you tomorrow between (.*?) T.")</code> 

and row-bind the two results to include both formats (follow up * and will call you *)

<code>sc1rb &lt;- rbind(sc1,sc2)</code> 

which did not work.

Is there any way to extract only the time part, along with the time zone, from the example strings above?

Thanks in advance!

R - Extract time along with timezone which is part of string
