
PAINT: Leishmania Sexual Reproductive Strategies As Resolved Through Computational Methods Designed for Aneuploid Genomes

Jahangheer S. Shaik, Deborah E. Dobson, David L. Sacks and Stephen M. Beverley





Generate Allele File


This page walks you through the steps required to generate an allele file using the PAINT package.

Prerequisites

1) Align the FASTQ files to the reference genome of interest using your favourite aligner (Bowtie, BWA, Novoalign, etc.).
2) Sort the resulting Binary Alignment Map (BAM) file using the samtools 'sort' command.

How to Run it?

The input to the program is a sorted Binary Alignment Map (BAM) file or a sorted Sequence Alignment Map (SAM) file, as described in the prerequisites. The findAlleles utility takes the sorted input file with the -i option, an output file name with the -o option and a reporting interval with the -n option. The -n option defines the interval at which the algorithm reports alleles: if n=1, it reports the allele at every genomic position; if n=10, it reports the allele every 10 bases, and so on.

Command Example

java -jar PAINT.jar findAlleles -i "../DemoData/BAMS/FV1_SAT_srt.bam" -o "../DemoData/alleleFiles/FV1_SAT_srt.allele" -n 1

Output

The output of the findAlleles program looks as follows:
LmjF01
9    3;0;0;0;1;2;326
10    0;0;4;0;2;2;476
11    0;0;4;0;2;2;476
12    0;0;4;0;2;2;476
13    0;6;0;0;3;3;776
14    8;0;0;0;4;4;1076
15    9;0;0;0;4;5;1226
16    0;0;9;0;4;5;1226
17    0;0;9;0;4;5;1226
18    0;0;9;0;4;5;1226
19    0;9;0;0;4;5;1226
20    9;0;0;0;4;5;1226
21    13;0;0;0;4;9;1686
22    0;0;15;0;5;10;1986
23    0;0;15;0;5;10;1986
........

LmjF02
.........
The chromosome name is listed first, and each subsequent row gives a base position followed by the allele composition at that position.
For example, "22    0;0;15;0;5;10;1986" means the following:
  • 22 (the value before the semicolon-separated fields): base position 22 on chromosome LmjF01.
  • 0;0;15;0 (fields 1 to 4): the allele composition is 0 As, 0 Ts, 15 Cs and 0 Gs. This locus is homozygous for 'C' with 100% allele frequency. The coverage at this locus is 15, as it is covered by 15 reads contributing 15 Cs.
  • 5;10 (fields 5 and 6): the orientation of the reads. In this case, 5 reads in forward orientation and 10 reads in reverse orientation contribute to the allele information.
  • 1986 (field 7): the cumulative mapping quality of the reads.
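
The record layout described above can be read programmatically. The following is a minimal Python sketch, not part of the PAINT package: the names AlleleRecord and parse_allele_text are hypothetical, and it assumes that a line holding a single token is a chromosome name and that the position is separated from the seven semicolon-delimited fields by whitespace, as in the listing shown on this page.

```python
from typing import Dict, List, NamedTuple


class AlleleRecord(NamedTuple):
    """One row of an allele file (hypothetical helper, not part of PAINT)."""
    position: int
    a: int          # reads supporting A
    t: int          # reads supporting T
    c: int          # reads supporting C
    g: int          # reads supporting G
    forward: int    # reads in forward orientation
    reverse: int    # reads in reverse orientation
    mapq_sum: int   # cumulative mapping quality


def parse_allele_text(text: str) -> Dict[str, List[AlleleRecord]]:
    """Group allele records by chromosome name."""
    records: Dict[str, List[AlleleRecord]] = {}
    chromosome = None
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == 1:
            # Assumption: a single token is a chromosome header such as "LmjF01".
            chromosome = fields[0]
            records.setdefault(chromosome, [])
            continue
        if chromosome is None:
            raise ValueError("record found before any chromosome name")
        position = int(fields[0])
        values = [int(v) for v in fields[1].split(";")]
        if len(values) != 7:
            raise ValueError(f"expected 7 fields, got {len(values)}: {raw!r}")
        records[chromosome].append(AlleleRecord(position, *values))
    return records


sample = """LmjF01
21    13;0;0;0;4;9;1686
22    0;0;15;0;5;10;1986
"""
parsed = parse_allele_text(sample)
rec = parsed["LmjF01"][1]
print(rec.position, rec.c, rec.forward, rec.reverse, rec.mapq_sum)
```

For a real file, the text could be read with open(path).read() and passed to parse_allele_text; for very large allele files a line-by-line reader would be the more memory-friendly choice.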

Consider a hypothetical record "99  0;4;9;0;5;8;1999", where the allele composition is 0 As, 4 Ts, 9 Cs and 0 Gs.
This locus is heterozygous TC, with T occurring at 30.77% allele frequency (4 of 13 reads) and C occurring at 69.23% allele frequency (9 of 13 reads).
When a contig is generated, C is reported for this locus because it occurs with the higher allele frequency. Other options are provided if both alleles are to be reported.
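
The frequency arithmetic for a heterozygous locus can be checked with a short Python sketch. It is illustrative only and is not PAINT's own implementation: the function names allele_frequencies and major_allele are hypothetical, and ties are broken by the A, T, C, G field order, which is an assumption rather than documented PAINT behaviour.

```python
from typing import Dict


def allele_frequencies(a: int, t: int, c: int, g: int) -> Dict[str, float]:
    """Return the percentage of reads supporting each base at one locus."""
    counts = {"A": a, "T": t, "C": c, "G": g}
    coverage = sum(counts.values())  # total reads contributing alleles
    if coverage == 0:
        return {base: 0.0 for base in counts}
    return {base: 100.0 * n / coverage for base, n in counts.items()}


def major_allele(freqs: Dict[str, float]) -> str:
    """Return the base with the highest frequency.

    Assumption: ties go to the first base in A, T, C, G order.
    """
    return max(freqs, key=freqs.get)


# Hypothetical record from the text: 99  0;4;9;0;5;8;1999
freqs = allele_frequencies(0, 4, 9, 0)
print({base: round(f, 2) for base, f in freqs.items()})
print(major_allele(freqs))
```

With 4 T reads and 9 C reads out of 13, the sketch gives about 30.77% for T and 69.23% for C, and selects C as the higher-frequency allele.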