I have a tab-delimited text file that contains a column of file paths, e.g. table.txt:
> SampleID Factor Condition Replicate Treatment Type Dataset isPE ReadLength isREF PathFASTQ
> DG13 fd3 c1 1 cc 0 0102 0 50 1 "/path/to/fastq"
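> DG14 fd3 c1 1 cc 1 0102 0 50 1 "/path/to/fastq"
I would like to store the paths in a bash array so that I can use them in a downstream parallel computation (an SGE task array). For simplicity, the leading and trailing " could easily be left out of table.txt.
Excluding the header row, I tried the following: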
files=($(awk '{ if(($8 == 0)) { print $1} }' table.txt ))
paths=($(awk '{ if(($8 == 0)) { print $11} }' table.txt ))
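infile="${paths[$SGE_TASK_ID]}"/"${files[$SGE_TASK_ID]}".fastq.gz
In case anyone is unfamiliar with it, $SGE_TASK_ID takes a user-defined integer value between 1 and N.
Unfortunately, $infile does not show the expected value for $SGE_TASK_ID=1:
/path/to/fastq/DG13.fastq.gz
Thanks for your help.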
Answered 2020-02-19 05:18:56
Please try the following. This code strips control characters (carriage returns) from the input while it runs.
myarr=($(awk '{gsub(/\r/,"")} match($NF,/\/[^"]*/){\
val=substr($NF,RSTART,RLENGTH);\
num=split(val,array,"/");\
print val"/"$1"."array[num]".gz"}' Input_file))
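for i in "${myarr[@]}"
do
  echo "$i"
done
If you want to remove the Control-M characters from Input_file itself, run the following command:
tr -d '\r' < Input_file > temp && mv temp Input_file
Printing the array with the loop shown above gives the following output: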
/path/to/fastq/DG13.fastq.gz
/path/to/fastq/DG14.fastq.gz
Explanation of the awk code:
awk '                                  ##Starting the awk program here.
{gsub(/\r/,"")}                        ##Removing carriage returns from the current line.
match($NF,/\/[^"]*/){                  ##Using awk's match function on the last field: match from the first / up to the closing ".
  val=substr($NF,RSTART,RLENGTH)       ##Creating variable val, the sub-string that starts at RSTART and is RLENGTH characters long.
  num=split(val,array,"/")             ##Splitting val into array with / as the delimiter; num is the number of elements produced by split.
  print val"/"$1"."array[num]".gz"     ##Printing val, a /, the first field, a dot, the last element of array, then .gz.
}
' Input_file                           ##Mentioning the Input_file name here.
Answered 2020-02-19 05:51:43
Please try this:
while read -r -a ary; do
    ((nr++)) || continue                # skip header line
    if (( ${ary[7]} == 0 )); then       # if "isPE" == 0 ..
        path=${ary[10]#\"}              # remove leading double-quote
        path=${path%\"}                 # remove trailing double-quote
        file=${ary[0]}
        # 1-based counter, kept separate from the scheduler's own SGE_TASK_ID
        infile[$((++idx))]="${path}/${file}.fastq.gz"
    fi
done < table.txt
echo "${infile[1]}"
echo "${infile[2]}"
Output:
/path/to/fastq/DG13.fastq.gz
/path/to/fastq/DG14.fastq.gz
https://stackoverflow.com/questions/60293147
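One likely reason the attempt in the question misses its target is that bash arrays are zero-based while SGE task IDs start at 1, so `${paths[$SGE_TASK_ID]}` with `SGE_TASK_ID=1` selects the second row rather than the first. The sketch below is only an illustration of how the pieces from both answers might be combined. It assumes bash 4 or later (for `mapfile`) and a genuinely tab-delimited file with Windows line endings. The names `table`, `infiles`, and `task_id` and the temporary sample file are hypothetical and not from the original post.

```bash
#!/usr/bin/env bash
# Hypothetical sample input, recreated in a temp file so the sketch runs on its own.
table=$(mktemp)
printf 'SampleID\tFactor\tCondition\tReplicate\tTreatment\tType\tDataset\tisPE\tReadLength\tisREF\tPathFASTQ\r\n' > "$table"
printf 'DG13\tfd3\tc1\t1\tcc\t0\t0102\t0\t50\t1\t"/path/to/fastq"\r\n' >> "$table"
printf 'DG14\tfd3\tc1\t1\tcc\t1\t0102\t0\t50\t1\t"/path/to/fastq"\r\n' >> "$table"

# Build a single array of full paths:
#   NR > 1   skips the header row
#   $8 == 0  keeps rows where isPE is 0
#   gsub     strips carriage returns and double quotes from the path column
mapfile -t infiles < <(
  awk -F'\t' 'NR > 1 && $8 == 0 {
    gsub(/[\r"]/, "", $11)
    print $11 "/" $1 ".fastq.gz"
  }' "$table"
)

# SGE sets SGE_TASK_ID inside an array job; fall back to 1 for a dry run.
task_id=${SGE_TASK_ID:-1}

# Bash arrays are zero-based, SGE task IDs start at 1, hence the offset.
infile=${infiles[task_id - 1]}
echo "$infile"

rm -f "$table"
```

Run outside a job (so `task_id` falls back to 1), this prints `/path/to/fastq/DG13.fastq.gz`. Keeping one array of finished paths avoids having to keep two parallel arrays in step, and reading `SGE_TASK_ID` without modifying it leaves the scheduler's value intact.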