bioinformatics - How to use a Linux command to extract sequencing data
I want to extract lines from the following sequencing data.
There is a file, ecoli.ffn, that looks as follows:
$ head ecoli.ffn
>ecoli16:g027092:gcf_000460315:gi|545267691|ref|nz_ke701669.1|:551259-572036
atgagcctgattattgatgttatttcgcgt
aaaacatccgtcaaacaaacgctgattaat
>ecoli16:g000011:55989:gi|218693476|ref|nc_011748.1|:1128430-1131042
gtgtacgctatggcgggtaattttgccgat
>ecoli16:g000012:55989:gi|218693476|ref|nc_011748.1|:1128430-1131042
gtgtacgctatggcgggtaattttgccgat
ctgacagctgttcttacactggattcaacc
ctgacagctgttcttacactggattcaacc
The index.txt file is as follows:
$ head index.txt
g000011
g000012
What I want is to extract from ecoli.ffn the records whose IDs are listed in index.txt. The ideal output is:
>ecoli16:g000011:55989:gi|218693476|ref|nc_011748.1|:1128430-1131042
gtgtacgctatggcgggtaattttgccgat
>ecoli16:g000012:55989:gi|218693476|ref|nc_011748.1|:1128430-1131042
gtgtacgctatggcgggtaattttgccgat
ctgacagctgttcttacactggattcaacc
ctgacagctgttcttacactggattcaacc
How can I do this?
You can write a simple script, ecoli.sh, using awk:
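#!/bin/bash
# For each ID in index.txt, print the matching FASTA record from ecoli.ffn.
# The ID is the second ':'-separated field of the header line.
for i in $(cat index.txt); do
    awk -F: -v i="$i" '
        /^>/ { flag = ($2 == i) }   # header line: keep the record only if its ID matches
        flag                        # print the header and every sequence line that follows it
    ' ecoli.ffn
done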
Then run the script in the shell, for example with bash ecoli.sh.
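The loop above rereads ecoli.ffn once per ID, which can be slow for a long index. A possible single-pass variant is sketched below. It assumes, as in the sample headers, that the ID is the second ':'-separated field and that index.txt holds one ID per line; the function name extract_records is made up for this illustration. Records come out in the order they appear in ecoli.ffn, not in the order of index.txt.

```shell
#!/bin/bash
# Hypothetical helper (name chosen for this sketch): extract_records INDEX FASTA
# Prints every FASTA record whose header ID is listed in INDEX, reading FASTA once.
extract_records() {
    awk -F: -v idx="$1" '
        BEGIN {
            # Load the wanted IDs (one per line) into an array before reading the FASTA file.
            while ((getline id < idx) > 0)
                if (id != "") wanted[id] = 1
            close(idx)
        }
        /^>/ { keep = ($2 in wanted) }   # header line: decide whether this record is wanted
        keep                             # print header and sequence lines of wanted records
    ' "$2"
}

# Usage: extract_records index.txt ecoli.ffn > selected.ffn
```

If index.txt was created on Windows, trailing carriage returns would stop the IDs from matching, so the file may need converting first.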