How do I replace the first space that is followed by a capital letter with a line break?

Stack Overflow user
Asked 2015-11-22 19:34:51
Answers: 3 · Views: 68 · Followers: 0 · Votes: 0

So I have this text (it is more than a thousand lines long):

ABO blood group antigens Carbohydrate antigens attached mainly to cell surface proteins or lipids that are present on many cell types, including red blood cells. These antigens differ among individuals, depending on inherited alleles encoding the enzymes required for synthesis of the carbohydrate antigens. The ABO antigens act as alloantigens that are responsible for blood transfusion reactions and hyperacute rejection of allografts.

Acquired immunodeficiency A deficiency in the immune system that is acquired after birth, usually because of infection (e.g., AIDS), and that is not related to a genetic defect. Synonymous with secondary immunodeficiency.

Acquired immunodeficiency syndrome (AIDS) A disease caused by human immunodeficiency virus (HIV) infection that is characterized by depletion of CD4+ T cells, leading to a profound defect in cell-mediated immunity. Clinically, AIDS includes opportunistic infections, malignant tumors, wasting, and encephalopathy.

Activation-induced cell death (AICD) Apoptosis of activated lymphocytes, generally used for T cells.

Activation-induced (cytidine) deaminase (AID) An enzyme expressed in B cells that catalyzes the conversion of cytosine into uracil in DNA, which is a step required for somatic hypermutation and affinity maturation of antibodies and for Ig class switching.

Activation protein 1 (AP-1) A family of DNA-binding transcription factors composed of dimers of two proteins that bind to one another through a shared structural motif called a leucine zipper. The best-characterized AP-1 factor is composed of the proteins Fos and Jun. AP-1 is involved in transcriptional regulation of many different genes that are important in the immune system, such as cytokine genes.

I want it to look like this:

ABO blood group antigens
Carbohydrate antigens attached mainly to cell surface proteins or lipids that are present on many cell types, including red blood cells. These antigens differ among individuals, depending on inherited alleles encoding the enzymes required for synthesis of the carbohydrate antigens. The ABO antigens act as alloantigens that are responsible for blood transfusion reactions and hyperacute rejection of allografts.

Acquired immunodeficiency
A deficiency in the immune system that is acquired after birth, usually because of infection (e.g., AIDS), and that is not related to a genetic defect. Synonymous with secondary immunodeficiency.

Acquired immunodeficiency syndrome (AIDS)
A disease caused by human immunodeficiency virus (HIV) infection that is characterized by depletion of CD4+ T cells, leading to a profound defect in cell-mediated immunity. Clinically, AIDS includes opportunistic infections, malignant tumors, wasting, and encephalopathy.

Activation-induced cell death (AICD)
Apoptosis of activated lymphocytes, generally used for T cells.

Activation-induced (cytidine) deaminase (AID)
An enzyme expressed in B cells that catalyzes the conversion of cytosine into uracil in DNA, which is a step required for somatic hypermutation and affinity maturation of antibodies and for Ig class switching.

Activation protein 1 (AP-1)
A family of DNA-binding transcription factors composed of dimers of two proteins that bind to one another through a shared structural motif called a leucine zipper. The best-characterized AP-1 factor is composed of the proteins Fos and Jun. AP-1 is involved in transcriptional regulation of many different genes that are important in the immune system, such as cytokine genes.

Is there a way to do this? I'm not a programmer. Thanks.

3 Answers

Stack Overflow user

Accepted answer

Posted 2015-11-22 19:50:33

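I tested this locally with your text and it worked. I'm not a regex expert, so it may not be the most efficient solution.

Use the Replace tab (Ctrl+H):

Find what: ^(.*?) ([A-Z].*$)

Replace with: \1\r\n\2

Make sure that both "Match case" and "Regular expression" are checked.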
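
Why "Match case" matters can be illustrated outside Notepad++. The following is a minimal Python sketch, not part of the original answer; it assumes that Python's `re.IGNORECASE` flag is a fair stand-in for leaving "Match case" unchecked. With case-insensitive matching, `[A-Z]` also accepts lowercase letters, so the split happens at the very first space instead of at the first capitalized word.

```python
import re

line = "Acquired immunodeficiency A deficiency in the immune system."

# Case-sensitive: [A-Z] only matches a real capital letter,
# so the break lands between the term and its definition.
case_sensitive = re.sub(r"^(.*?) ([A-Z].*$)", r"\1\n\2", line)

# Case-insensitive (assumed analogue of "Match case" unchecked):
# [A-Z] now also matches "i", so the break lands after the first word.
case_insensitive = re.sub(
    r"^(.*?) ([A-Z].*$)", r"\1\n\2", line, flags=re.IGNORECASE
)

print(case_sensitive)
print("---")
print(case_insensitive)
```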

Explanation

Find what:

^           starts with
.           anything
*           repeated 0 or more times
?           lazy match so that it stops at the capital letter (next group)
(.*?)       remember that part (group 1)
            followed by a space
[A-Z]       match the capital letter
.           anything
*           repeated 0 or more times
$           ends with
([A-Z].*$)  remember that part (group 2)

Replace with:

\1          group 1
\r          carriage return
\n          new line
\2          group 2
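
For readers who prefer a script to the Notepad++ dialog, the same find/replace can be sketched in Python. This is only an illustration under assumptions: the function name `split_term_from_definition` is made up here, `\n` is used instead of `\r\n`, and `re.MULTILINE` is needed so that `^` and `$` apply to every line rather than to the whole text.

```python
import re

# Same pattern as above: lazily capture everything up to the first space
# that is followed by a capital letter, then capture the rest of the line.
PATTERN = re.compile(r"^(.*?) ([A-Z].*$)", re.MULTILINE)


def split_term_from_definition(text: str) -> str:
    """Put a line break between each term and its definition."""
    return PATTERN.sub(r"\1\n\2", text)


sample = (
    "Acquired immunodeficiency A deficiency in the immune system.\n"
    "\n"
    "Activation protein 1 (AP-1) A family of DNA-binding transcription factors.\n"
)
print(split_term_from_definition(sample))
```

Like the original pattern, this splits at the first space followed by a capital letter, so a term whose second or later word is capitalized would be split too early.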
Votes: 0

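Stack Overflow user

Posted 2015-11-22 19:51:44

You need to do a regular-expression replacement (find a space followed by a capital letter).

In Notepad++, use Find/Replace in regular-expression mode (make sure "Match Case" is checked).

Find what: (^.)(A) Replace with: \1\r\n\2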

Votes: 0

Stack Overflow user

Posted 2015-11-22 20:28:07

Yes, use a Perl script. This one works well, I think.

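#!/usr/bin/perl
use strict;
use warnings;

# Number of line breaks already inserted in the current entry.
my $cestbon = 0;
while (<>) {
    # Split the line into words (this also drops the trailing newline).
    my @line = split(" ", $_);
    # A blank line separates entries: reset the counter.
    if (/^$/) {
        $cestbon = 0;
        print "\n";
    }
    foreach (@line) {
        # Break before the first two words of an entry that contain a
        # capital letter followed only by lowercase letters or digits.
        if (/\b[A-Z][a-z0-9]*\b/ && $cestbon < 2) {
            print "\n$_ ";
            $cestbon++;
        } else {
            print "$_ ";
        }
    }
}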

To run it (this was on an MBP, whose OS is a UNIX):

cat sample.txt | ./sample.pl

ABO blood group antigens 
Carbohydrate antigens attached mainly to cell surface proteins or lipids that are present on many cell types, including red blood cells. 
These antigens differ among individuals, depending on inherited alleles encoding the enzymes required for synthesis of the carbohydrate antigens. The ABO antigens act as alloantigens that are responsible for blood transfusion reactions and hyperacute rejection of allografts. 

Acquired immunodeficiency 
A deficiency in the immune system that is acquired after birth, usually because of infection (e.g., AIDS), and that is not related to a genetic defect. Synonymous with secondary immunodeficiency. 

Acquired immunodeficiency syndrome (AIDS) 
A disease caused by human immunodeficiency virus (HIV) infection that is characterized by depletion of CD4+ T cells, leading to a profound defect in cell-mediated immunity. Clinically, AIDS includes opportunistic infections, malignant tumors, wasting, and encephalopathy. 

Activation-induced cell death (AICD) 
Apoptosis of activated lymphocytes, generally used for T cells. 

Activation-induced (cytidine) deaminase (AID) 
An enzyme expressed in B cells that catalyzes the conversion of cytosine into uracil in DNA, which is a step required for somatic hypermutation and affinity maturation of antibodies and for Ig class switching. 

Activation protein 1 (AP-1) 
A family of DNA-binding transcription factors composed of dimers of two proteins that bind to one another through a shared structural motif called a leucine zipper. The best-characterized AP-1 factor is composed of the proteins Fos and Jun. AP-1 is involved in transcriptional regulation of many different genes that are important in the immune system, such as cytokine genes.

It may not be perfect, but I wrote it in 10 minutes, so give me a break :)
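
The output above shows one extra break in the first entry (before "These"), because the script allows two breaks per entry. As a hedged alternative, here is a word-by-word sketch in Python that breaks at most once per line and uses no regular expression. The function name `split_entry` is hypothetical, and the sketch assumes that each entry sits on a single line and that the definition starts at the first capitalized word after the first word.

```python
def split_entry(line: str) -> str:
    """Break a line before the first capitalized word after the first word."""
    words = line.split(" ")
    # Start at index 1 so the term's own leading capital is ignored.
    for i in range(1, len(words)):
        if words[i][:1].isupper():
            return " ".join(words[:i]) + "\n" + " ".join(words[i:])
    # No capitalized word found (or a blank line): leave the line unchanged.
    return line


entries = [
    "ABO blood group antigens Carbohydrate antigens attached mainly to cell surface proteins.",
    "",
    "Acquired immunodeficiency syndrome (AIDS) A disease caused by HIV infection.",
]
for entry in entries:
    print(split_entry(entry))
```

To process a whole file, the same function could be applied to each line read from the file after stripping its trailing newline.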

Votes: 0
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/33859286
