Hi Christine,

The reference sequence for a given genome contains only one allele at any given position. That allele is selected from the pool of sample genomes used to create the assembly. It is generally the allele that either occurs most often across the samples or has the most reads supporting it at that location, depending on the sequencing methods used. Each sequencing consortium can have its own guidelines for picking the "reference" allele that is included in a particular genome assembly.
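To make the "most common allele" idea concrete, here is a minimal Python sketch. It assumes the simplest possible rule (take the allele observed most often at a position). The function name `pick_reference_allele` and the example data are hypothetical, and an actual consortium's selection rules may well differ.

```python
from collections import Counter


def pick_reference_allele(observed_alleles):
    """Return the allele seen most often at a single position.

    Hypothetical helper: illustrates a simple majority rule only,
    not any consortium's actual procedure.
    """
    if not observed_alleles:
        raise ValueError("no alleles observed at this position")
    counts = Counter(observed_alleles)
    allele, _count = counts.most_common(1)[0]
    return allele


# Made-up observations at one position (from samples or reads).
alleles_at_position = ["A", "A", "G", "A", "G", "A"]
print(pick_reference_allele(alleles_at_position))  # prints A
```

Here "A" is seen four times and "G" twice, so "A" would be the allele recorded in the reference, and "G" would show up as an alternate allele in any sample that carries it.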

For the Bos taurus BTAU_4.6.1 genome available in GenomeBrowse in particular, the reference sequence was assembled from the pooled reads of BACs and whole-genome shotgun sequences from two cattle, one male and one female. You can find specifics on how the assembly was created on the NCBI website.

When you view BAM files in GenomeBrowse, the coverage plots are color coded according to whether the reads match the reference sequence for your species and build. Reads that support the reference are shown in gray, and alternate alleles are shown in blue, green, yellow, or red. So if variant calling was performed on your BAM file, any potential heterozygous calls could look similar to the screenshot below.

[Screenshot: GenomeBrowse coverage plot showing a potential heterozygous call]
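If it helps to see the same idea in numbers, the sketch below tallies reference and alternate bases at one position, which is roughly what the gray and colored portions of a coverage plot represent. It is only an illustration: the function `summarize_position`, the input string, and the 20% to 80% thresholds for flagging a possible heterozygous site are all assumptions of mine, not how GenomeBrowse or any variant caller actually works.

```python
from collections import Counter


def summarize_position(read_bases, ref_base,
                       min_alt_fraction=0.2, max_alt_fraction=0.8):
    """Count reference vs. alternate bases covering one position.

    Hypothetical helper; the thresholds are illustrative assumptions.
    """
    ref = ref_base.upper()
    counts = Counter(base.upper() for base in read_bases)
    depth = sum(counts.values())
    if depth == 0:
        return {"depth": 0, "ref_count": 0,
                "alt_counts": {}, "possible_het": False}
    ref_count = counts.get(ref, 0)
    # Everything that is not the reference base counts as an alternate allele.
    alt_counts = {b: n for b, n in counts.items() if b != ref}
    alt_fraction = (depth - ref_count) / depth
    return {
        "depth": depth,
        "ref_count": ref_count,
        "alt_counts": alt_counts,
        "possible_het": min_alt_fraction <= alt_fraction <= max_alt_fraction,
    }


# Made-up bases from ten reads covering one position; reference base is G.
summary = summarize_position("GGGGGAAAAG", "G")
print(summary["ref_count"], summary["alt_counts"], summary["possible_het"])
```

In this made-up example, six reads support the reference "G" and four carry "A", so the site is flagged as a possible heterozygous call under the assumed thresholds.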

Please let me know if I can clarify further.

Thanks, Jami...