Hi Krystel,

I am sorry you are having issues creating your reference sequence track.

Our Data Convert Wizard is strict about the file formats and extensions it supports for each type of conversion. For the FASTA converter in particular, the file must end in *.fa or *.fasta (or the *.gz version of either) for the wizard to pick the correct convert options.

If you are downloading the reference sequence for the Enterobacteria phage lambda genome from the NCBI FTP site, the file closest to a supported format is NC_001416.fna. Before adding the file to the converter, rename its extension to *.fa.
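
If you would rather script the rename than do it by hand, here is a minimal Python sketch. The function names `fa_name` and `copy_as_fa` are my own for illustration and are not part of GenomeBrowse; the sketch copies the file rather than renaming it so the original download is left untouched.

```python
from pathlib import Path
import shutil


def fa_name(path):
    """Return the path with a .fna (or .fna.gz) ending swapped for .fa (or .fa.gz)."""
    p = Path(path)
    if p.name.endswith(".fna.gz"):
        return p.with_name(p.name[: -len(".fna.gz")] + ".fa.gz")
    if p.suffix == ".fna":
        return p.with_suffix(".fa")
    return p  # some other extension: leave the name unchanged


def copy_as_fa(path):
    """Copy the file to its .fa name and return the new path."""
    src = Path(path)
    dst = fa_name(src)
    if dst != src:
        shutil.copyfile(src, dst)
    return dst
```

For example, `copy_as_fa("NC_001416.fna")` would write `NC_001416.fa` next to the original, and that copy is the file to add to the converter.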

Then, on the next dialog, you will need to rename the segments (chromosomes/scaffolds) present in the file, since the headers in NCBI FASTA files contain more than just the chromosome names.

Additionally, if you have BAM files for this species, make sure the segment names you select in the convert wizard exactly match the names in the header of the BAM file. In the screenshot below I have renamed the segment to "1" and listed the NCBI name "NC_001416.1" as the Alias Name. Either the "Segment" or the "Alias Name" can be used to match the BAM file naming convention.

[Screenshot: segment renamed to "1", with "NC_001416.1" entered as the Alias Name]
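
If you want to check the names before running the wizard, a rough Python sketch like the one below can help. It assumes you have the BAM header as plain text (for example from `samtools view -H your.bam`) and the lines of your FASTA file. The helper names and the sample header lines in the usage note are illustrative assumptions, not output from GenomeBrowse or from your actual files.

```python
def fasta_ids(lines):
    """Return the first whitespace-delimited token of each '>' header line."""
    return [ln[1:].split()[0] for ln in lines
            if ln.startswith(">") and ln[1:].strip()]


def sam_sequence_names(header_text):
    """Return the SN: values from the @SQ lines of a SAM/BAM text header."""
    names = []
    for ln in header_text.splitlines():
        if ln.startswith("@SQ"):
            for field in ln.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def unmatched(bam_names, segment_to_alias):
    """Return BAM sequence names that match neither a Segment nor an Alias Name."""
    known = set(segment_to_alias) | set(segment_to_alias.values())
    return [name for name in bam_names if name not in known]
```

With a made-up BAM header of `"@SQ\tSN:NC_001416.1\tLN:48502"` and the wizard choices from the screenshot written as `{"1": "NC_001416.1"}`, `unmatched(sam_sequence_names(header), {"1": "NC_001416.1"})` returns an empty list, meaning every BAM sequence name is covered by either the Segment or the Alias Name.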

At this point you will also want to create a Build Name for your genome. We generally use the assembly name given by NCBI, but no assembly name exists for this genome, so you can give it any informative name that you will recognize. Keep in mind that if you create any further data sources for this genome, you will want to use the same Build Name.

If you continue to have issues creating or using the reference sequence for this genome, please let me know. It would be most useful if you could provide a link to the file you are trying to convert, or send a copy of the file to me at genomebrowse@goldenhelix.com.

Thanks, Jami...