Generate Allele File

This page walks you through the steps required to generate an allele file using the PAINT package.
Prerequisites

1) Align the FASTQ files to the reference genome of interest using your preferred aligner (Bowtie, BWA, Novoalign, etc.).
2) Sort the resulting Binary Alignment Map (BAM) file using the samtools 'sort' function.
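Before running PAINT, it can help to confirm that the alignment file really is sorted. The sketch below is not part of PAINT. It assumes the default coordinate sort order of samtools 'sort', which is recorded in the SAM header as SO:coordinate on the @HD line; the page does not state which sort order PAINT expects, so treat that as an assumption. The function name is hypothetical, and the header lines are assumed to come from something like "samtools view -H".

```python
def sam_is_coordinate_sorted(header_lines):
    """Return True if the @HD header line declares SO:coordinate.

    header_lines: iterable of SAM header lines (tab-separated text).
    Assumption: coordinate order is what the downstream tool expects.
    """
    for line in header_lines:
        if line.startswith("@HD"):
            # Fields after the record type look like TAG:VALUE.
            fields = line.rstrip("\n").split("\t")[1:]
            tags = dict(f.split(":", 1) for f in fields if ":" in f)
            return tags.get("SO") == "coordinate"
    # No @HD line at all: the sort order is undeclared.
    return False


header = ["@HD\tVN:1.6\tSO:coordinate\n", "@SQ\tSN:LmjF01\tLN:1000\n"]
print(sam_is_coordinate_sorted(header))
```

The @SQ line in the usage example is made up for illustration; only the @HD line is inspected.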
How to Run it?

The input to the program is a sorted Binary Alignment Map (BAM) file or a sorted Sequence Alignment Map (SAM) file, as described in the prerequisites. The findAlleles utility takes three options: -i, the sorted BAM file to read; -o, the name of the output file; and -n, the interval at which the algorithm reports alleles. If n=1, the algorithm reports the allele at every genomic position; if n=10, it reports an allele every 10 bases, and so on.
Command Example

java -jar PAINT.jar findAlleles -i "../DemoData/BAMS/FV1_SAT_srt.bam" -o "../DemoData/alleleFiles/FV1_SAT_srt.allele" -n 1
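When many BAM files need processing, the same command can be assembled programmatically. The following is a minimal sketch that uses only the options shown on this page (-i, -o, -n). The helper name find_alleles_command, the default jar location, and the check that the interval is a positive integer are assumptions made for illustration.

```python
import shlex


def find_alleles_command(bam_path, allele_path, interval=1, jar="PAINT.jar"):
    """Build the argument list for one findAlleles run.

    Assumption: the interval (-n) must be a positive integer.
    """
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    return [
        "java", "-jar", jar, "findAlleles",
        "-i", bam_path,       # sorted BAM (or SAM) input
        "-o", allele_path,    # allele file to write
        "-n", str(interval),  # report an allele every `interval` bases
    ]


cmd = find_alleles_command(
    "../DemoData/BAMS/FV1_SAT_srt.bam",
    "../DemoData/alleleFiles/FV1_SAT_srt.allele",
    interval=1,
)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
```

To actually launch the run, the list can be passed to subprocess.run(cmd, check=True) from the standard library.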
Output

The output of the findAlleles program looks as follows:
LmjF01
9 3;0;0;0;1;2;326
10 0;0;4;0;2;2;476
11 0;0;4;0;2;2;476
12 0;0;4;0;2;2;476
13 0;6;0;0;3;3;776
14 8;0;0;0;4;4;1076
15 9;0;0;0;4;5;1226
16 0;0;9;0;4;5;1226
17 0;0;9;0;4;5;1226
18 0;0;9;0;4;5;1226
19 0;9;0;0;4;5;1226
20 9;0;0;0;4;5;1226
21 13;0;0;0;4;9;1686
22 0;0;15;0;5;10;1986
23 0;0;15;0;5;10;1986
........
LmjF02
.........

The chromosome name is listed first, and each subsequent row shows a base position and the allele composition at that position. For example, "22 0;0;15;0;5;10;1986" means the following:
- "22" (first field): the base position, here position 22 on chromosome LmjF01.
- "0;0;15;0" (next four values): the allele composition is 0 As, 0 Ts, 15 Cs and 0 Gs. This locus is homozygous for 'C' with 100% allele frequency. The coverage at this locus is 15, as it is covered by 15 reads, all contributing Cs.
- "5;10" (next two values): the orientation of the reads. In this case, 5 reads with forward orientation and 10 with reverse orientation contribute to the allele information.
- "1986" (last value): the cumulative mapping quality of the reads.
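The field layout described above can be turned into a small parser. This is a sketch, not code from PAINT: the names AlleleRecord, parse_allele_row and iter_allele_file are hypothetical. It also assumes that the position and the semicolon-separated values are divided by whitespace, and that a line with a single token is a chromosome name.

```python
from typing import NamedTuple


class AlleleRecord(NamedTuple):
    position: int
    a: int
    t: int
    c: int
    g: int
    forward: int   # reads in forward orientation
    reverse: int   # reads in reverse orientation
    mapq_sum: int  # cumulative mapping quality


def parse_allele_row(row):
    """Parse one row such as '22 0;0;15;0;5;10;1986'."""
    position, fields = row.split()
    a, t, c, g, fwd, rev, mapq = (int(x) for x in fields.split(";"))
    return AlleleRecord(int(position), a, t, c, g, fwd, rev, mapq)


def iter_allele_file(lines):
    """Yield (chromosome, AlleleRecord) pairs from allele-file lines."""
    chrom = None
    for line in lines:
        parts = line.split()
        if not parts:
            continue              # skip blank lines
        if len(parts) == 1:
            chrom = parts[0]      # assumed: a lone token names a chromosome
        else:
            yield chrom, parse_allele_row(line)


rec = parse_allele_row("22 0;0;15;0;5;10;1986")
coverage = rec.a + rec.t + rec.c + rec.g
print(rec.position, coverage, rec.forward, rec.reverse)
```

In the rows shown on this page, the coverage computed from the four base counts equals the sum of the forward and reverse read counts, which gives a simple sanity check on parsed rows.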
Consider a hypothetical situation where the following numbers are found: "99 0;4;9;0;5;8;1999". The allele composition is 0 As, 4 Ts, 9 Cs and 0 Gs. This locus is heterozygous TC, with T occurring at 30.77% allele frequency (4 of 13 reads) and C occurring at 69.23% allele frequency (9 of 13 reads). When generating a contig, C will be reported for this locus because it occurs at the higher allele frequency. Other options are provided if both alleles are to be reported.
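The arithmetic behind the allele frequencies and the choice of the reported base can be sketched as follows. This illustrates the calculation described above and is not PAINT's own implementation. The function names are hypothetical, and the handling of ties (the first base in A, T, C, G order wins) and of loci with no reads (None is returned) are assumptions.

```python
def allele_frequencies(counts):
    """Return {base: percent} for every base supported by at least one read."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: 100.0 * n / total for base, n in counts.items() if n > 0}


def major_allele(counts):
    """Return the base with the highest count, or None if there are no reads.

    Assumption: on a tie, the first base in dictionary order is returned.
    """
    if sum(counts.values()) == 0:
        return None
    return max(counts, key=counts.get)


# Hypothetical locus from the text: "99 0;4;9;0;5;8;1999"
counts = {"A": 0, "T": 4, "C": 9, "G": 0}
freqs = allele_frequencies(counts)
for base, pct in freqs.items():
    print(f"{base}: {pct:.2f}%")   # T: 30.77%, then C: 69.23%
print(major_allele(counts))        # C
```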