Linux text processing: sed

Original article
Author: 用户10412487
Published: 2023-03-29 13:49:36
Included in the column: 生信技能树-R

sed (pic1)

[Figure: pic1]

sed examples

Mar402 10:42:55 ~
$ cat Data/readme.txt 
Welcome to Biotrainee() !
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:09:08 ~
$ cat Data/readme.txt | sed '1a Welcome to Biotrainee() ' # append a line after line 1
Welcome to Biotrainee() !
Welcome to Biotrainee() 
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:09:44 ~
$ cat Data/readme.txt | sed '1,2i Hi ' # insert "Hi" before each of lines 1 and 2
Hi 
Welcome to Biotrainee() !
Hi 
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:12:44 ~
$ cat Data/readme.txt | sed '1,3d' # delete lines 1 through 3
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:14:52 ~
$ cat Data/readme.txt | sed '/jmzeng1314@163.com/d' # delete every line that matches jmzeng1314@163.com
Welcome to Biotrainee() !
This is your personal account in our Cloud.
Have a fun with it.
(http://www.biotrainee.com/thread-1376-1-1.html)
# To apply several operations in one call, chain them: sed -e '...' -e '...'
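
The note about chaining operations can be sketched as follows. This is only an illustration: the temporary file is a scratch copy of the readme.txt listing above, and the choice of the two operations (delete line 1, delete the e-mail line) is an assumption made for the example.

```shell
#!/bin/sh
# Scratch copy of the sample file (hypothetical temp path, created only for this sketch).
readme=$(mktemp)
cat > "$readme" <<'EOF'
Welcome to Biotrainee() !
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
EOF

# Two operations in a single sed call: delete line 1, then delete the
# line containing the e-mail address. sed reads the file itself, so the
# leading cat is optional.
result=$(sed -e '1d' -e '/jmzeng1314@163\.com/d' "$readme")
printf '%s\n' "$result"

rm -f "$readme"
```

Each -e expression is applied in order to every input line, so the result should match what two piped sed calls would produce.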

Change: sed 'c' replaces the content of multiple lines (pic2)

[Figure: pic2]
# Method 1:
Mar402 12:17:11 ~
$ cat Data/readme.txt | sed '2,4c *****'
Welcome to Biotrainee() !
*****
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:20:04 ~
$ cat Data/readme.txt | sed '2,4c *****\n*****\n*****'
Welcome to Biotrainee() !
*****
*****
*****
(http://www.biotrainee.com/thread-1376-1-1.html)
# Method 2:
Mar402 12:21:13 ~
$ cat Data/readme.txt | sed '2,4 i ***** '
Welcome to Biotrainee() !
***** 
This is your personal account in our Cloud.
***** 
Have a fun with it.
***** 
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:23:06 ~
$ cat Data/readme.txt | sed -e '2,4 i *****' -e '2,4d' # note: the second -e does not need another "sed" in front of it
Welcome to Biotrainee() !
*****
*****
*****
(http://www.biotrainee.com/thread-1376-1-1.html)
# Method 3:
Mar402 12:23:33 ~
$ cat Data/readme.txt | sed -e '2 c *****' -e '3 c *****' -e '4 c *****'
Welcome to Biotrainee() !
*****
*****
*****
(http://www.biotrainee.com/thread-1376-1-1.html)
# Method 4:
Mar402 12:25:47 ~
$ cat Data/readme.txt | sed '2 c *****' | sed '3 c *****' | sed '4 c *****'
Welcome to Biotrainee() !
*****
*****
*****
(http://www.biotrainee.com/thread-1376-1-1.html)
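
One more option, not among the four methods above: because `c` on a range collapses the whole range into a single replacement, substituting the full content of each line in the range keeps one output line per input line. A minimal sketch, assuming a scratch copy of readme.txt in a temporary file:

```shell
#!/bin/sh
# Scratch copy of the sample file (hypothetical temp path, for illustration only).
readme=$(mktemp)
cat > "$readme" <<'EOF'
Welcome to Biotrainee() !
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
EOF

# On lines 2 to 4, replace the entire line (.*) with five asterisks.
# An asterisk is literal in the replacement part of s///.
result=$(sed '2,4s/.*/*****/' "$readme")
printf '%s\n' "$result"

rm -f "$readme"
```

This form uses only the `s` command with a line-range address, so it does not depend on the one-line `c` and `i` syntax used above.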

sed 's///' substitution

Mar402 12:33:38 ~
$ cat Data/readme.txt | sed 's/is/IS/g' # as in vim, the g flag replaces every match on a line
Welcome to Biotrainee() !
ThIS IS your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)

Mar402 12:33:45 ~
$ cat Data/readme.txt | sed 's/is/IS/2' # with 2 in place of g, only the second match on each line is replaced
Welcome to Biotrainee() !
This IS your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:39:36 ~
$ cat Data/readme.txt | sed '1~3s/ee/EE/' # 1~3 (GNU sed) starts at line 1 and steps by 3, so the substitution applies to lines 1, 4, 7, ...
Welcome to BiotrainEE() !
This is your personal account in our Cloud.
Have a fun with it.
Please fEEl free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:45:56 ~
$ cat Data/readme.txt | sed '/www/ s/ee/EE/' # replace the first ee with EE only on lines matching www
Welcome to Biotrainee() !
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainEE.com/thread-1376-1-1.html)
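
Two small variations on the substitutions above, offered as a sketch rather than part of the original session: a different delimiter for patterns that contain slashes, and the g and p flags combined on a single addressed line. The temporary file is again an assumed scratch copy of readme.txt.

```shell
#!/bin/sh
# Scratch copy of the sample file (hypothetical temp path, for illustration only).
readme=$(mktemp)
cat > "$readme" <<'EOF'
Welcome to Biotrainee() !
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
EOF

# Any character may follow s as the delimiter; using | avoids escaping
# the slashes in a URL. The /www/ address limits the change to that line.
last=$(sed '/www/ s|http://|https://|' "$readme" | tail -n 1)

# Flags can be combined: on line 4 only, replace every ee (g) and print
# the line (p); -n suppresses all other output.
line4=$(sed -n '4s/ee/EE/gp' "$readme")

printf '%s\n' "$last" "$line4"
rm -f "$readme"
```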

sed -n '//p' search

Mar402 12:47:42 ~
$ cat Data/readme.txt | sed '/ee/p' # without -n every line is printed once and lines containing ee are printed twice, hence the need for -n
Welcome to Biotrainee() !
Welcome to Biotrainee() !
This is your personal account in our Cloud.
Have a fun with it.
Please feel free to contact with me( email to jmzeng1314@163.com )
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:50:31 ~
$ cat Data/readme.txt | sed -n '/ee/p' # print only the lines containing ee
Welcome to Biotrainee() !
Please feel free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainee.com/thread-1376-1-1.html)
Mar402 12:50:39 ~
$ cat Data/readme.txt | sed -n 's/ee/EE/p' # search and substitution combined: print only the lines where a substitution happened
Welcome to BiotrainEE() !
Please fEEl free to contact with me( email to jmzeng1314@163.com )
(http://www.biotrainEE.com/thread-1376-1-1.html)
Mar402 12:54:16 ~
$ cat Data/readme.txt | sed 'y/abcde/ABCDE/' # sed 'y///' translates characters one for one
WElComE to BiotrAinEE() !
This is your pErsonAl ACCount in our ClouD.
HAvE A fun with it.
PlEAsE fEEl frEE to ContACt with mE( EmAil to jmzEng1314@163.Com )
(http://www.BiotrAinEE.Com/thrEAD-1376-1-1.html)

Mar402 12:57:47 ~
$ cat Data/readme.txt | tr '[a-z]' '[A-Z]'
WELCOME TO BIOTRAINEE() !
THIS IS YOUR PERSONAL ACCOUNT IN OUR CLOUD.
HAVE A FUN WITH IT.
PLEASE FEEL FREE TO CONTACT WITH ME( EMAIL TO JMZENG1314@163.COM )
(HTTP://WWW.BIOTRAINEE.COM/THREAD-1376-1-1.HTML)
Mar402 12:58:03 ~
$ cat Data/readme.txt | sed 'y/[a-z]/[A-Z]/' # y does not expand ranges: this maps [ to [, a to A, - to -, z to Z and ] to ], which is different from tr
Welcome to BiotrAinee() !
This is your personAl Account in our Cloud.
HAve A fun with it.
PleAse feel free to contAct with me( emAil to jmZeng1314@163.com )
(http://www.biotrAinee.com/threAd-1376-1-1.html)
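
Since `y` takes no ranges, getting the same effect as the `tr` call with sed means spelling out both alphabets. A minimal sketch; the variable names `lower` and `upper` are introduced here only for readability:

```shell
#!/bin/sh
# y/// needs both character lists written out in full and of equal length.
lower=abcdefghijklmnopqrstuvwxyz
upper=ABCDEFGHIJKLMNOPQRSTUVWXYZ

# Double quotes let the shell expand the two variables into the sed script.
result=$(printf '%s\n' 'Have a fun with it.' | sed "y/$lower/$upper/")
printf '%s\n' "$result"
```

For plain case conversion `tr` is usually the shorter tool; `y` is handy when the translation is part of a longer sed script.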

Exercise 2

[Figure: Exercise 2]
# 1.
Mar402 12:58:43 ~
$ head Data/example.gtf
chr1    ENSEMBL UTR     1737    2090    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
chr1    ENSEMBL exon    1737    2090    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
chr1    ENSEMBL transcript      1737    4275    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
chr1    HAVANA  gene    1737    4275    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENSG00000223972"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1"; level 2; havana_gene "OTTHUMG00000000961";
chr1    HAVANA  exon    1873    1920    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
chr1    HAVANA  transcript      1873    3533    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
chr1    HAVANA  exon    2042    2090    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
chr1    HAVANA  exon    2476    2560    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
chr1    ENSEMBL UTR     2476    2584    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
chr1    ENSEMBL exon    2476    2584    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";

# 2.
Mar402 13:03:08 ~
$ head Data/example.gtf | sed 's/HAVANA/ENSEMBL/'
chr1    ENSEMBL UTR     1737    2090    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
chr1    ENSEMBL exon    1737    2090    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
chr1    ENSEMBL transcript      1737    4275    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
chr1    ENSEMBL gene    1737    4275    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENSG00000223972"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1"; level 2; havana_gene "OTTHUMG00000000961";
chr1    ENSEMBL exon    1873    1920    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
chr1    ENSEMBL transcript      1873    3533    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
chr1    ENSEMBL exon    2042    2090    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
chr1    ENSEMBL exon    2476    2560    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "unprocessed_pseudogene"; transcript_status "KNOWN"; transcript_name "RP11-34P13-001"; level 2; havana_gene "OTTHUMG00000000961"; havana_transcript "OTTHUMT00000002844"; ont "PGO:0000005";
chr1    ENSEMBL UTR     2476    2584    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
chr1    ENSEMBL exon    2476    2584    .       +       .       gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";

# 3.
Mar402 13:14:16 ~
$ head Data/example.fa 
>gi|556503834|ref|NC_000913.3| Escherichia coli str. K-12 substr. MG1655, complete genome
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
TTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAA
TATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACC
ATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAG
CCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA
GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC
AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTG
AAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT
GACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCGCAATTGAAAACTTTCGTCGATCAGGAATTT
Mar402 13:13:07 ~
$ head Data/example.fa | sed '2,$ y/ATCG/TAGC/' # complement only (not reversed)
>gi|556503834|ref|NC_000913.3| Escherichia coli str. K-12 substr. MG1655, complete genome
TCGAAAAGTAAGACTGACGTTGCCCGTTATACAGAGACACACCTAATTTTTTTCTCACAGACTATCGTCG
AAGACTTGACCAATGGACGGCACTCATTTAATTTTAAAATAACTGAATCCAGTGATTTATGAAATTGGTT
ATATCCGTATCGCGTGTCTGTCTATTTTTAATGTCTCATGTGTTGTAGGTACTTTGCGTAATCGTGGTGG
TAATGGTGGTGGTAGTGGTAATGGTGTCCATTGCCACGCCCGACTGCGCATGTCCTTTGTGTCTTTTTTC
GGGCGTGGACTGTCACGCCCGAAAAAAAAAGCTGGTTTCCATTGCTCCATTGTTGGTACGCTCACAACTT
CAAGCCGCCATGTAGTCACCGTTTACGTCTTGCAAAAGACGCACAACGGCTATAAGACCTTTCGTTACGG
TCCGTCCCCGTCCACCGGTGGCAGGAGAGACGGGGGCGGTTTTAGTGGTTGGTGGACCACCGCTACTAAC
TTTTTTGGTAATCGCCGGTCCTACGAAATGGGTTATAGTCGCTACGGCTTGCATAAAAACGGCTTGAAAA
CTGCCCTGAGCGGCGGCGGGTCGGCCCCAAGGGCGACCGCGTTAACTTTTGAAAGCAGCTAGTCCTTAAA

Mar402 13:25:46 ~
$ head Data/example.fa | sed '2,$ y/ATCG/TAGC/' > tmp.fa # save the result to tmp.fa
Mar402 13:25:53 ~
$ cat tmp.fa
>gi|556503834|ref|NC_000913.3| Escherichia coli str. K-12 substr. MG1655, complete genome
TCGAAAAGTAAGACTGACGTTGCCCGTTATACAGAGACACACCTAATTTTTTTCTCACAGACTATCGTCG
AAGACTTGACCAATGGACGGCACTCATTTAATTTTAAAATAACTGAATCCAGTGATTTATGAAATTGGTT
ATATCCGTATCGCGTGTCTGTCTATTTTTAATGTCTCATGTGTTGTAGGTACTTTGCGTAATCGTGGTGG
TAATGGTGGTGGTAGTGGTAATGGTGTCCATTGCCACGCCCGACTGCGCATGTCCTTTGTGTCTTTTTTC
GGGCGTGGACTGTCACGCCCGAAAAAAAAAGCTGGTTTCCATTGCTCCATTGTTGGTACGCTCACAACTT
CAAGCCGCCATGTAGTCACCGTTTACGTCTTGCAAAAGACGCACAACGGCTATAAGACCTTTCGTTACGG
TCCGTCCCCGTCCACCGGTGGCAGGAGAGACGGGGGCGGTTTTAGTGGTTGGTGGACCACCGCTACTAAC
TTTTTTGGTAATCGCCGGTCCTACGAAATGGGTTATAGTCGCTACGGCTTGCATAAAAACGGCTTGAAAA
CTGCCCTGAGCGGCGGCGGGTCGGCCCCAAGGGCGACCGCGTTAACTTTTGAAAGCAGCTAGTCCTTAAA

# 4.
Mar402 13:26:03 ~
$ cat Data/md5.txt | sed '1d'
d57df747bc142e9850074d512ab9d6db;3331c6a9e0183ff9d398a3292dd45f66       SRR1039508_1.fastq.gz;SRR1039508_2.fastq.gz
49400c5685f36f830a277a59004b119d;ab4410a432cc18c1b9f10f93634e5310       SRR1039509_1.fastq.gz;SRR1039509_2.fastq.gz
d2c2d92c67c943648fdde6c70bc0d920;3e4223e08b97f37f3da17d686739e75c       SRR1039510_1.fastq.gz;SRR1039510_2.fastq.gz
4073b1519608c24c0c1119b580dfd9eb;2fcb23d5fb63e322d80cd3cab75faa0b       SRR1039511_1.fastq.gz;SRR1039511_2.fastq.gz
a35f30576f25ea548c7b3a28895a81cf;83bbe3c587d9477938826ea19c53a281       SRR1039512_1.fastq.gz;SRR1039512_2.fastq.gz
b3073b5b057f24208ac1853fdd4b5875;945cb34259d6dbf0362fe9018f769de4;ecb43490d03c9b325352e70488d58611      SRR1039513.fastq.gz;SRR1039513_1.fastq.gz;SRR1039513_2.fastq.gz
ae35fe0ce13badacc48c65717e811528;9ef4fe59d6378c513f933e24d12f6047       SRR1039514_1.fastq.gz;SRR1039514_2.fastq.gz
929b988eb5730eba77aeac98bf8be35f;c674d2ea79835165828b37258abbc925;5640a85f2c181d4886e905e74a32f041      SRR1039515.fastq.gz;SRR1039515_1.fastq.gz;SRR1039515_2.fastq.gz
8f97b3dc8170ecd6fffb39101c3e5bf5;2c4d2ba3b812f14bce25966c98b5b5df;8599c02799338b9514e8d0077a8409e4      SRR1039516.fastq.gz;SRR1039516_1.fastq.gz;SRR1039516_2.fastq.gz
1f2796f07033ec3bfab0981bd0674bb9;008ba2b3b589d553e3e9f8890d5481c2       SRR1039517_1.fastq.gz;SRR1039517_2.fastq.gz
64d1444ad727f48066aeb6ad314d9190;a24eea863bdca0284591fcd5eb076a93       SRR1039518_1.fastq.gz;SRR1039518_2.fastq.gz
f11f41c013ffaf3a031c9836ce81e6ef;9283f111ef774248f6f666e4bf2b1f81;9bcb6c9675631b1dcb8b07f6916d546c      SRR1039519.fastq.gz;SRR1039519_1.fastq.gz;SRR1039519_2.fastq.gz
d8251c87ba3c803d4344c2b24c77b19d;ca8e0014e7ba56982adc37439cea0755;62838f21e66ec78030b51ee6019420ef      SRR1039520.fastq.gz;SRR1039520_1.fastq.gz;SRR1039520_2.fastq.gz
637e08d030778c6581731647f3c3d8cc;4be82ad33d7d4990bed3c4bc701dc070;435aa5e48ba77e4c42218930a0be0de1      SRR1039521.fastq.gz;SRR1039521_1.fastq.gz;SRR1039521_2.fastq.gz
789e86036c81a85d2c1f014f79822d64;54c572cead4074b126f0b81b344af1be;c461a163b72a71efb4027045e6b4d2f6      SRR1039522.fastq.gz;SRR1039522_1.fastq.gz;SRR1039522_2.fastq.gz
ae33f7f6d536d020a2562b8be6e9cc33;083213dc45820db2eb62d66b89e77ce9       SRR1039523_1.fastq.gz;SRR1039523_2.
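
In exercise 2, `s/HAVANA/ENSEMBL/` changes the first HAVANA found anywhere on a line. If the goal is to touch only the second (source) column of a GTF file, the pattern can be anchored to that column. The sketch below uses two made-up, shortened GTF rows; the `note "HAVANA"` attribute is hypothetical and exists only to show the difference.

```shell
#!/bin/sh
# Two shortened, hypothetical GTF rows (tab separated), not taken from example.gtf.
gtf=$(mktemp)
{
  printf 'chr1\tHAVANA\tgene\t1737\t4275\tgene_id "g1";\n'
  printf 'chr1\tENSEMBL\texon\t1737\t2090\tnote "HAVANA";\n'
} > "$gtf"

# A literal tab in a shell variable keeps the regex portable.
tab=$(printf '\t')

# Unanchored: the first HAVANA on each line is replaced, wherever it is.
naive=$(sed 's/HAVANA/ENSEMBL/' "$gtf")

# Anchored: match HAVANA only when it is the whole second field.
anchored=$(sed "s/^\([^$tab]*\)${tab}HAVANA$tab/\1${tab}ENSEMBL$tab/" "$gtf")

printf '%s\n' "$anchored"
rm -f "$gtf"
```

With the anchored form the second row is left unchanged, while the unanchored form rewrites its attribute text.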

How to get the reverse complement of a single-line sequence?

Mar402 13:27:05 ~
$ head -2 Data/example.fa | sed '1d'
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
Mar402 13:33:48 ~
$ head -2 Data/example.fa | sed '1d' | sed 'y/ATCG/TAGC/'
TCGAAAAGTAAGACTGACGTTGCCCGTTATACAGAGACACACCTAATTTTTTTCTCACAGACTATCGTCG
Mar402 13:35:33 ~
$ head -2 Data/example.fa | sed '1d' | sed 'y/ATCG/TAGC/' | rev
GCTGCTATCAGACACTCTTTTTTTAATCCACACAGAGACATATTGCCCGTTGCAGTCAGAATGAAAAGCT
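
The two steps above (complement with `y`, then `rev`) can be wrapped in a small helper. This is a sketch under assumptions: the function name `revcomp_line` is made up here, and the lowercase mapping is an addition that the original command does not have. Characters outside the list, such as N, pass through unchanged.

```shell
#!/bin/sh
# Hypothetical helper: reverse complement of one sequence string.
revcomp_line() {
  # y/// complements upper- and lowercase bases; rev reverses the line.
  printf '%s\n' "$1" | sed 'y/ACGTacgt/TGCAtgca/' | rev
}

result=$(revcomp_line 'AACGtn')
printf '%s\n' "$result"

# Applying the helper twice should give back the input.
roundtrip=$(revcomp_line "$result")
```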

Reverse complement of a multi-line sequence

Mar402 13:38:13 ~
$ head  Data/example.fa | sed '2,$ y/ATCG/TAGC/' # complement
>gi|556503834|ref|NC_000913.3| Escherichia coli str. K-12 substr. MG1655, complete genome
TCGAAAAGTAAGACTGACGTTGCCCGTTATACAGAGACACACCTAATTTTTTTCTCACAGACTATCGTCG
AAGACTTGACCAATGGACGGCACTCATTTAATTTTAAAATAACTGAATCCAGTGATTTATGAAATTGGTT
ATATCCGTATCGCGTGTCTGTCTATTTTTAATGTCTCATGTGTTGTAGGTACTTTGCGTAATCGTGGTGG
TAATGGTGGTGGTAGTGGTAATGGTGTCCATTGCCACGCCCGACTGCGCATGTCCTTTGTGTCTTTTTTC
GGGCGTGGACTGTCACGCCCGAAAAAAAAAGCTGGTTTCCATTGCTCCATTGTTGGTACGCTCACAACTT
CAAGCCGCCATGTAGTCACCGTTTACGTCTTGCAAAAGACGCACAACGGCTATAAGACCTTTCGTTACGG
TCCGTCCCCGTCCACCGGTGGCAGGAGAGACGGGGGCGGTTTTAGTGGTTGGTGGACCACCGCTACTAAC
TTTTTTGGTAATCGCCGGTCCTACGAAATGGGTTATAGTCGCTACGGCTTGCATAAAAACGGCTTGAAAA
CTGCCCTGAGCGGCGGCGGGTCGGCCCCAAGGGCGACCGCGTTAACTTTTGAAAGCAGCTAGTCCTTAAA
Mar402 13:38:47 ~
$ head  Data/example.fa | sed '2,$ y/ATCG/TAGC/' | sed '1d' # delete the first line (the header)
TCGAAAAGTAAGACTGACGTTGCCCGTTATACAGAGACACACCTAATTTTTTTCTCACAGACTATCGTCG
AAGACTTGACCAATGGACGGCACTCATTTAATTTTAAAATAACTGAATCCAGTGATTTATGAAATTGGTT
ATATCCGTATCGCGTGTCTGTCTATTTTTAATGTCTCATGTGTTGTAGGTACTTTGCGTAATCGTGGTGG
TAATGGTGGTGGTAGTGGTAATGGTGTCCATTGCCACGCCCGACTGCGCATGTCCTTTGTGTCTTTTTTC
GGGCGTGGACTGTCACGCCCGAAAAAAAAAGCTGGTTTCCATTGCTCCATTGTTGGTACGCTCACAACTT
CAAGCCGCCATGTAGTCACCGTTTACGTCTTGCAAAAGACGCACAACGGCTATAAGACCTTTCGTTACGG
TCCGTCCCCGTCCACCGGTGGCAGGAGAGACGGGGGCGGTTTTAGTGGTTGGTGGACCACCGCTACTAAC
TTTTTTGGTAATCGCCGGTCCTACGAAATGGGTTATAGTCGCTACGGCTTGCATAAAAACGGCTTGAAAA
CTGCCCTGAGCGGCGGCGGGTCGGCCCCAAGGGCGACCGCGTTAACTTTTGAAAGCAGCTAGTCCTTAAA
Mar402 13:39:16 ~
$ head  Data/example.fa | sed '2,$ y/ATCG/TAGC/' | sed '1d' | rev # reverse each line
GCTGCTATCAGACACTCTTTTTTTAATCCACACAGAGACATATTGCCCGTTGCAGTCAGAATGAAAAGCT
TTGGTTAAAGTATTTAGTGACCTAAGTCAATAAAATTTTAATTTACTCACGGCAGGTAACCAGTTCAGAA
GGTGGTGCTAATGCGTTTCATGGATGTTGTGTACTCTGTAATTTTTATCTGTCTGTGCGCTATGCCTATA
CTTTTTTCTGTGTTTCCTGTACGCGTCAGCCCGCACCGTTACCTGTGGTAATGGTGATGGTGGTGGTAAT
TTCAACACTCGCATGGTTGTTACCTCGTTACCTTTGGTCGAAAAAAAAAGCCCGCACTGTCAGGTGCGGG
GGCATTGCTTTCCAGAATATCGGCAACACGCAGAAAACGTTCTGCATTTGCCACTGATGTACCGCCGAAC
CAATCATCGCCACCAGGTGGTTGGTGATTTTGGCGGGGGCAGAGAGGACGGTGGCCACCTGCCCCTGCCT
AAAAGTTCGGCAAAAATACGTTCGGCATCGCTGATATTGGGTAAAGCATCCTGGCCGCTAATGGTTTTTT
AAATTCCTGATCGACGAAAGTTTTCAATTGCGCCAGCGGGAACCCCGGCTGGGCGGCGGCGAGTCCCGTC
Mar402 13:39:22 ~
$ head  Data/example.fa | sed '2,$ y/ATCG/TAGC/' | sed '1d' | rev | tac # then reverse the order of the lines
AAATTCCTGATCGACGAAAGTTTTCAATTGCGCCAGCGGGAACCCCGGCTGGGCGGCGGCGAGTCCCGTC
AAAAGTTCGGCAAAAATACGTTCGGCATCGCTGATATTGGGTAAAGCATCCTGGCCGCTAATGGTTTTTT
CAATCATCGCCACCAGGTGGTTGGTGATTTTGGCGGGGGCAGAGAGGACGGTGGCCACCTGCCCCTGCCT
GGCATTGCTTTCCAGAATATCGGCAACACGCAGAAAACGTTCTGCATTTGCCACTGATGTACCGCCGAAC
TTCAACACTCGCATGGTTGTTACCTCGTTACCTTTGGTCGAAAAAAAAAGCCCGCACTGTCAGGTGCGGG
CTTTTTTCTGTGTTTCCTGTACGCGTCAGCCCGCACCGTTACCTGTGGTAATGGTGATGGTGGTGGTAAT
GGTGGTGCTAATGCGTTTCATGGATGTTGTGTACTCTGTAATTTTTATCTGTCTGTGCGCTATGCCTATA
TTGGTTAAAGTATTTAGTGACCTAAGTCAATAAAATTTTAATTTACTCACGGCAGGTAACCAGTTCAGAA
GCTGCTATCAGACACTCTTTTTTTAATCCACACAGAGACATATTGCCCGTTGCAGTCAGAATGAAAAGCT
# original sequence
Mar402 13:43:15 ~
$ head Data/example.fa
>gi|556503834|ref|NC_000913.3| Escherichia coli str. K-12 substr. MG1655, complete genome
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
TTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAA
TATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACC
ATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAG
CCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA
GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCC
AGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTG
AAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT
GACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCGCAATTGAAAACTTTCGTCGATCAGGAATTT
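
The pipeline above drops the FASTA header. A possible variant that keeps it is sketched below, assuming a file with a single record; the function name `revcomp_fasta` and the toy record are made up for the example, and `rev` and `tac` are used exactly as in the session above.

```shell
#!/bin/sh
# Toy single-record FASTA (hypothetical data, not from example.fa).
fa=$(mktemp)
cat > "$fa" <<'EOF'
>toy_record hypothetical example
AACC
GGTT
AC
EOF

# Hypothetical helper for a single-record FASTA file.
revcomp_fasta() {
  # Print the header line unchanged.
  sed -n '1p' "$1"
  # Drop the header, complement the bases, reverse every line,
  # then reverse the order of the lines.
  sed -e '1d' -e 'y/ACGTacgt/TGCAtgca/' "$1" | rev | tac
}

result=$(revcomp_fasta "$fa")
printf '%s\n' "$result"

# Join the sequence lines to check the result as one string.
joined=$(printf '%s\n' "$result" | sed '1d' | tr -d '\n')
rm -f "$fa"
```

When the last sequence line is shorter than the others it ends up first in the output, so the sequence is correct but the line wrapping becomes uneven. A file with several records would need to be split per record first.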

----From 生信技能树----

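Original-content statement: this article is published on the Tencent Cloud Developer Community (腾讯云开发者社区) with the author's authorization and may not be reproduced without permission.

In case of infringement, contact cloudcommunity@tencent.com for removal.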
