Everyone should use the same JSON test data that I provide. Open this URL in your browser to download it: http://biotrainee.com/jbrowse/JBrowse-1.12.1/sample_data/json/modencode/modencodeMetaData.json
A sample looks like this:
{
   "types" : {
      "data set" : {
         "pluralLabel" : "data sets"
      }
   },
   "items" : [
      {
         "technique" : "ChIP-chip",
         "factor" : "BEAF-32",
         "target" : "Non TF Chromatin binding factor",
         "principal_investigator" : "White, K.",
         "Tracks" : [
            "fly/White_INSULATORS_WIG/BEAF32"
         ],
         "submission" : "21",
         "label" : "BEAF-32;Embryos 0-12 hr;ChIP-chip",
         "category" : "Other chromatin binding sites",
         "type" : "data set",
         "Developmental-Stage" : "Embryos 0-12 hr",
         "organism" : "D. melanogaster"
      },
      {
         "technique" : "ChIP-chip",
         "factor" : "CP190",
         "target" : "Non TF Chromatin binding factor",
         "principal_investigator" : "White, K.",
         "Tracks" : [
            "fly/White_INSULATORS_WIG/CP190"
         ],
         "submission" : "22",
         "label" : "CP190;Embryos 0-12 hr;ChIP-chip",
         "category" : "Other chromatin binding sites",
         "type" : "data set",
         "Developmental-Stage" : "Embryos 0-12 hr",
         "organism" : "D. melanogaster"
      },
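Because post length is limited, I have only pasted part of the file; please download it and look at the rest yourself. Once you have the complete JSON, you can view its structure with an online tool: http://json.parser.online.fr/ If you are not familiar with the JSON format, please look it up yourself. The metadata that TCGA now provides through the GDC is also in JSON format.

We need to extract the following columns from this JSON file: technique, factor, target, principal_investigator, submission, label, category, type, Developmental-Stage, organism, key. Of course, this can be done with regular expressions.

The finished result should look like this: http://biotrainee.com/jbrowse/JBrowse-1.12.1/sample_data/json/modencode/modencodeMetaData.csv You can likewise open it in your browser, download it, and view it in Excel.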
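If you would rather inspect the structure locally than paste the file into an online viewer, something along these lines might help. It is only a rough sketch: the `summarize` helper and the tiny inline sample are made up for illustration (the sample is shaped like the excerpt above, not the full file), and it uses `JSON::PP`, which ships with Perl 5.14 and later.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

# Hypothetical helper: summarise the layout of the metadata JSON.
# Returns the top-level keys, the number of items, and every field
# name that appears in at least one item.
sub summarize {
    my ($json_text) = @_;
    my $data  = decode_json($json_text);
    my $items = $data->{items} // [];
    my %seen;
    for my $item (@$items) {
        $seen{$_}++ for keys %$item;
    }
    return {
        top_keys => [ sort keys %$data ],
        n_items  => scalar @$items,
        fields   => [ sort keys %seen ],
    };
}

# A tiny inline sample (assumption: same shape as the real file).
my $sample = <<'JSON';
{
  "types" : { "data set" : { "pluralLabel" : "data sets" } },
  "items" : [
    { "factor" : "BEAF-32", "Tracks" : [ "fly/White_INSULATORS_WIG/BEAF32" ], "submission" : "21" },
    { "factor" : "CP190",   "Tracks" : [ "fly/White_INSULATORS_WIG/CP190" ],  "submission" : "22" }
  ]
}
JSON

my $s = summarize($sample);
print "top-level keys: @{ $s->{top_keys} }\n";
print "items: $s->{n_items}\n";
print "fields: @{ $s->{fields} }\n";
```

To run it on the downloaded file, you would read the file contents into a string and pass that to `summarize` instead of `$sample`.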
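I won't go into more detail. The main difficulty is understanding JSON. For this assignment I recommend using an existing package; regular expressions can do the job, but they are far too much trouble. Here is one Perl solution: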
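#!/usr/bin/env perl
use strict;
use warnings;
use autodie ':all';
use 5.10.0;
use JSON 2;

# Slurp the JSON file named on the command line and decode it.
my $data = from_json( do { local $/; open my $f, '<', $ARGV[0]; scalar <$f> } );

# Columns to output, in order.
my @fields = qw( technique factor target principal_investigator submission label category type Developmental-Stage organism key );

# Header row, each name wrapped in double quotes.
say join ',', map "\"$_\"", @fields;

for my $item ( @{$data->{items}} ) {
    # Keep the original label as "key"; "label" is replaced by the track name below.
    $item->{key} = $item->{label};
    no warnings 'uninitialized';    # missing fields print as empty strings
    # One CSV row per track.
    for my $track ( @{$item->{Tracks}} ) {
        $item->{label} = $track;
        say join ',', map "\"$_\"", @{$item}{@fields};
    }
}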
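One thing to watch with the script above: it wraps each value in double quotes but does not escape a double quote that appears inside a value. I have not checked whether the test data contains any, so treat the following as an optional variant rather than a required fix. The helpers `csv_field` and `item_rows` are names I made up for this sketch, the inline item is a cut-down copy of the second sample record, and `JSON::PP` (core since Perl 5.14) stands in for the `JSON` module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

my @FIELDS = qw( technique factor target principal_investigator submission
                 label category type Developmental-Stage organism key );

# Quote one CSV field, doubling any embedded double quotes (RFC 4180 style).
# Undefined values become empty strings.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/"/""/g;
    return qq{"$value"};
}

# Hypothetical helper: build one CSV line per track of an item.
# As in the script above, "key" takes the original label and
# "label" takes the track name.
sub item_rows {
    my ($item) = @_;
    my %row = %$item;    # shallow copy so the caller's hash is not modified
    $row{key} = $row{label};
    my @lines;
    for my $track ( @{ $item->{Tracks} || [] } ) {
        $row{label} = $track;
        push @lines, join ',', map { csv_field($_) } @row{@FIELDS};
    }
    return @lines;
}

# Cut-down inline sample (assumption: same shape as the real file).
my $demo = decode_json(<<'JSON');
{ "items" : [ { "factor" : "CP190", "label" : "CP190;Embryos 0-12 hr;ChIP-chip",
                "Tracks" : [ "fly/White_INSULATORS_WIG/CP190" ], "submission" : "22" } ] }
JSON

print join( ',', map { csv_field($_) } @FIELDS ), "\n";
print "$_\n" for map { item_rows($_) } @{ $demo->{items} };
```

Copying each item into `%row` before editing it is a small design choice: the decoded data stays untouched, which makes it easier to reuse later in the same script.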