v.2.1.5

Job c9baedb1 - CrBVI

Status: Finished
Finished at: 2021/02/03 20:12
Finished in: 3 min
Mail notification: roman.martin@uni-marburg.de
Notification date: 2021/04/01 16:09

Annotation results

Your genome annotation results are ready and will remain available at least until 2021/09/03 20:12.

The related files (SQN Submission File, Genome Feature Table) can be downloaded, and the results can be viewed dynamically in the genome web browser.

Please always cite:

  • Roman Martin, Thomas Hackl, Georges Hattab, Matthias G Fischer, Dominik Heider (2020). MOSGA: Modular Open-Source Genome Annotator. Bioinformatics. 36(22-23). 5514–5515. doi: 10.1093/bioinformatics/btaa1003.
  • Roman Martin, Hagen Dreßler, Georges Hattab, Thomas Hackl, Matthias G Fischer, Dominik Heider (2021). MOSGA 2: Comparative genomics and validation tools. Computational and Structural Biotechnology Journal. 19. 5504-5509. doi: 10.1016/j.csbj.2021.09.024.

Do you have questions or issues, or would you like to give us feedback? Please don't hesitate to write to us or open a new issue on Gitlab.

Scaffold Repeats Gene tRNA Total
Total 4871 7592 44 12507
VLTN01000001.1 172 304 0 476
VLTN01000002.1 131 292 0 423
VLTN01000003.1 119 282 0 401
VLTN01000004.1 71 254 3 328
VLTN01000005.1 104 207 0 311
VLTN01000006.1 120 187 1 308
VLTN01000007.1 74 162 0 236
VLTN01000008.1 76 122 1 199
VLTN01000009.1 121 127 3 251
VLTN01000010.1 97 147 0 244
VLTN01000011.1 111 115 0 226
VLTN01000012.1 89 144 0 233
VLTN01000013.1 95 154 0 249
VLTN01000014.1 80 130 1 211
VLTN01000015.1 63 153 0 216
VLTN01000016.1 55 158 4 217
VLTN01000017.1 121 125 2 248
VLTN01000018.1 52 150 1 203
VLTN01000019.1 58 136 0 194
VLTN01000020.1 77 127 0 204
VLTN01000021.1 72 134 0 206
VLTN01000022.1 84 78 0 162
VLTN01000023.1 49 116 0 165
VLTN01000024.1 36 112 0 148
VLTN01000025.1 58 79 1 138
VLTN01000026.1 58 96 1 155
VLTN01000027.1 48 110 9 167
VLTN01000028.1 62 88 0 150
VLTN01000029.1 99 56 1 156
VLTN01000030.1 75 82 0 157
VLTN01000031.1 83 77 0 160
VLTN01000032.1 48 91 0 139
VLTN01000033.1 51 93 0 144
VLTN01000034.1 67 65 0 132
VLTN01000035.1 33 96 0 129
VLTN01000036.1 50 77 0 127
VLTN01000037.1 49 75 0 124
VLTN01000038.1 64 60 0 124
VLTN01000039.1 48 69 0 117
VLTN01000040.1 30 79 0 109
VLTN01000041.1 33 74 0 107
VLTN01000042.1 42 62 0 104
VLTN01000043.1 49 57 0 106
VLTN01000044.1 33 50 0 83
VLTN01000045.1 40 74 0 114
VLTN01000046.1 45 80 0 125
VLTN01000047.1 10 78 8 96
VLTN01000048.1 33 68 0 101
VLTN01000049.1 54 51 0 105
VLTN01000050.1 45 69 0 114
VLTN01000051.1 48 31 0 79
VLTN01000052.1 37 44 0 81
VLTN01000053.1 29 63 0 92
VLTN01000054.1 20 62 0 82
VLTN01000055.1 40 35 0 75
VLTN01000056.1 38 51 0 89
VLTN01000057.1 38 47 0 85
VLTN01000058.1 20 63 0 83
VLTN01000059.1 31 47 0 78
VLTN01000060.1 25 49 0 74
VLTN01000061.1 27 45 0 72
VLTN01000062.1 45 30 0 75
VLTN01000063.1 29 44 0 73
VLTN01000064.1 27 28 0 55
VLTN01000065.1 20 39 0 59
VLTN01000066.1 36 40 0 76
VLTN01000067.1 12 32 2 46
VLTN01000068.1 22 50 0 72
VLTN01000069.1 53 29 0 82
VLTN01000070.1 37 37 0 74
VLTN01000071.1 31 47 0 78
VLTN01000072.1 19 47 0 66
VLTN01000073.1 32 65 0 97
VLTN01000074.1 19 24 0 43
VLTN01000075.1 33 25 0 58
VLTN01000076.1 21 32 0 53
VLTN01000077.1 21 44 0 65
VLTN01000078.1 22 19 0 41
VLTN01000079.1 39 20 0 59
VLTN01000080.1 16 14 3 33
VLTN01000081.1 12 31 0 43
VLTN01000082.1 26 34 0 60
VLTN01000083.1 38 20 0 58
VLTN01000084.1 14 26 0 40
VLTN01000085.1 42 12 0 54
VLTN01000086.1 14 14 0 28
VLTN01000087.1 7 13 0 20
VLTN01000088.1 13 12 0 25
VLTN01000089.1 15 13 0 28
VLTN01000090.1 27 12 0 39
VLTN01000091.1 31 11 0 42
VLTN01000092.1 10 21 0 31
VLTN01000093.1 4 24 0 28
VLTN01000094.1 13 6 0 19
VLTN01000095.1 2 12 0 14
VLTN01000096.1 9 5 0 14
VLTN01000097.1 1 13 0 14
VLTN01000098.1 6 17 0 23
VLTN01000099.1 12 13 0 25
VLTN01000100.1 1 5 0 6
VLTN01000101.1 22 8 1 31
VLTN01000102.1 0 6 0 6
VLTN01000103.1 3 6 0 9
VLTN01000104.1 4 8 0 12
VLTN01000105.1 4 13 0 17
VLTN01000106.1 3 5 0 8
VLTN01000107.1 22 4 0 26
VLTN01000108.1 2 10 0 12
VLTN01000109.1 10 4 0 14
VLTN01000110.1 5 4 0 9
VLTN01000111.1 8 2 0 10
VLTN01000112.1 1 6 0 7
VLTN01000113.1 4 5 0 9
VLTN01000114.1 10 5 0 15
VLTN01000115.1 1 2 0 3
VLTN01000116.1 6 0 0 6
VLTN01000117.1 24 1 0 25
VLTN01000118.1 2 3 0 5
VLTN01000119.1 1 0 0 1
VLTN01000120.1 2 1 0 3
VLTN01000121.1 0 2 0 2
VLTN01000122.1 0 4 0 4
VLTN01000123.1 3 6 0 9
VLTN01000124.1 1 0 0 1
VLTN01000125.1 9 3 0 12
VLTN01000126.1 9 2 0 11
VLTN01000127.1 1 2 0 3
VLTN01000128.1 10 4 0 14
VLTN01000129.1 2 4 0 6
VLTN01000130.1 3 4 0 7
VLTN01000131.1 5 1 0 6
VLTN01000132.1 9 1 0 10
VLTN01000133.1 3 4 0 7
VLTN01000134.1 2 2 0 4
VLTN01000135.1 2 3 0 5
VLTN01000136.1 1 2 0 3
VLTN01000137.1 5 2 0 7
VLTN01000138.1 1 3 0 4
VLTN01000139.1 7 3 0 10
VLTN01000140.1 2 2 0 4
VLTN01000141.1 0 3 0 3
VLTN01000142.1 0 5 0 5
VLTN01000143.1 2 4 0 6
VLTN01000144.1 0 1 0 1
VLTN01000145.1 0 4 0 4
VLTN01000146.1 0 5 0 5
VLTN01000147.1 5 3 0 8
VLTN01000148.1 10 0 0 10
VLTN01000149.1 0 4 0 4
VLTN01000151.1 1 2 0 3
VLTN01000152.1 3 1 0 4
VLTN01000154.1 1 4 0 5
VLTN01000155.1 8 3 0 11
VLTN01000156.1 1 0 0 1
VLTN01000159.1 1 3 0 4
VLTN01000160.1 0 2 0 2
VLTN01000161.1 0 1 0 1
VLTN01000162.1 1 2 0 3
VLTN01000163.1 1 0 0 1
VLTN01000164.1 0 1 0 1
VLTN01000165.1 8 2 0 10
VLTN01000166.1 0 2 0 2
VLTN01000167.1 1 2 0 3
VLTN01000168.1 1 1 0 2
VLTN01000169.1 0 1 0 1
CM017891.1 0 0 2 2
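
The per-scaffold counts above can be checked programmatically. The following is a minimal sketch, assuming the table has been saved as whitespace-separated text with the header row first. The function names `parse_summary` and `check_totals` are hypothetical and not part of MOSGA.

```python
from typing import Dict, List, Tuple

# (repeats, genes, tRNAs, reported total) for one scaffold
Row = Tuple[int, int, int, int]


def parse_summary(text: str) -> Dict[str, Row]:
    """Parse the whitespace-separated summary table into {scaffold: counts}."""
    rows: Dict[str, Row] = {}
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    for fields in lines[1:]:  # skip the header row
        name = fields[0]
        repeats, genes, trnas, total = (int(x) for x in fields[1:5])
        rows[name] = (repeats, genes, trnas, total)
    return rows


def check_totals(rows: Dict[str, Row]) -> List[str]:
    """Return the rows whose Total differs from Repeats + Gene + tRNA."""
    return [name for name, (r, g, t, total) in rows.items() if r + g + t != total]


# Three rows copied from the table above, used here only as sample input.
sample = """\
Scaffold Repeats Gene tRNA Total
VLTN01000001.1 172 304 0 476
VLTN01000004.1 71 254 3 328
CM017891.1 0 0 2 2
"""
rows = parse_summary(sample)
mismatches = check_totals(rows)
print(mismatches)  # an empty list means every row sum matches its Total
```

Note that the summary "Total" row, if included in the input, is parsed like any other row, so it is checked for internal consistency but not against the column sums.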

Annotation database Snakemake configuration Snakemake log Validation File Validation File Error Summary Discrepancy Report What to cite

Single outputs
Genome Annotation GFF Feature Table Writing Organelle Scan Mitos Plastids WindowMasker Import RepeatMasker Import RepeatMasker Results RepeatMasker Stats tRNAscan-SE 2 Import tRNAscan-SE 2 Results Barrnap Import Barrnap Results SILVA LSU Import SILVA LSU Results Swiss-Prot Database Results EggNog 5 Database Results

What to cite
Morgulis A, Gertz EM, Schäffer AA, Agarwala R (2006). WindowMasker: window-based masker for sequenced genomes. Bioinformatics. 22(2):134‐141.
Seemann T. barrnap 0.9 : rapid ribosomal RNA prediction. https://github.com/tseemann/barrnap
Martin R, Hackl T, Hattab G, Fischer MG, Heider D (2020). MOSGA: Modular Open-Source Genome Annotator. Bioinformatics. 36(22-23):5514-5515. doi: 10.1093/bioinformatics/btaa1003
Huerta-Cepas J, Forslund K, Coelho LP, Szklarczyk D, Jensen LJ, Von Mering C, Bork P (2017). Fast genome-wide functional annotation through orthology assignment by eggNOG-mapper. Mol Biol Evol.
Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO (2013). The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41 (D1): D590-D596.
Hoff KJ, Lomsadze A, Borodovsky M, Stanke M (2019). Whole-Genome Annotation with BRAKER. Methods Mol Biol. 1962:65-95.
Nawrocki EP, Kolbe DL, Eddy SR (2009). Infernal 1.0: inference of RNA alignments. Bioinformatics. 25(10):1335-7. Erratum in: Bioinformatics. 2009 Jul 1;25(13):1713. doi: 10.1093/bioinformatics/btp157
Hoff, K. J. and Stanke, M. (2019). Predicting Genes in Single Genomes with AUGUSTUS. Current Protocols in Bioinformatics, 65(1).
Stanke M, Diekhans M, Baertsch R, Haussler D (2008). Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics.
Martin R, Dreßler H, Hattab G, Hackl T, Fischer MG, Heider D (2021). MOSGA 2: Comparative genomics and validation tools. bioRxiv 2021.07.29.454382. doi: 10.1101/2021.07.29.454382
Kim D, Langmead B, Salzberg SL (2015). HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 12(4):357-60. doi: 10.1038/nmeth.3317
Huerta-Cepas J, Szklarczyk D, Heller D, Hernández-Plaza A, Forslund SK, Cook H, Mende DR, Letunic I, Rattei T, Jensen LJ, Von Mering C, Bork P (2019). eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 47.
Stanke M, Schöffmann O, Morgenstern B, Waack S (2006). Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7, 62.
Hoff KJ, Lange S, Lomsadze A, Borodovsky M, Stanke M (2016). BRAKER1: unsupervised RNA-Seq-based genome annotation with GeneMark-ET and AUGUSTUS. Bioinformatics, 32(5):767-769.
Brůna, T., Hoff, K.J., Lomsadze, A., Stanke, M., & Borodovsky, M. (2020). BRAKER2: Automatic Eukaryotic Genome Annotation with GeneMark-EP+ and AUGUSTUS Supported by a Protein Database, NAR Genomics and Bioinformatics 3(1):lqaa108, doi: 10.1093/nargab/lqaa108.
Buels R, Yao E, Diesh CM, Hayes RD, Munoz-Torres M, Helt G, Goodstein DM, Elsik CG, Lewis SE, Stein L, Holmes IH (2016). JBrowse: a dynamic web platform for genome visualization and analysis. Genome Biol. 17:66. doi: 10.1186/s13059-016-0924-1
Bairoch, A., Boeckmann, B., Ferro, S., and Gasteiger, E. (2004). Swiss-Prot: juggling between evolution and stability. Briefings in Bioinformatics, 5(1), 39–55.
Langmead B, Salzberg SL (2012). Fast gapped-read alignment with Bowtie 2. Nat Methods. 9(4):357-9. doi: 10.1038/nmeth.1923
Chan, P.P., Lin, B., and Lowe, T.M (2019). tRNAscan-SE 2.0: Improved Detection and Functional Classification of Transfer RNA Genes. BioRxiv. doi: 10.1101/614032
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics. 25(16):2078-9. doi: 10.1093/bioinformatics/btp352
Buchfink B, Xie C, Huson DH (2015). Fast and sensitive protein alignment using DIAMOND. Nat Methods. 12(1):59‐60.
Smit AFA. RepeatMasker. http://www.repeatmasker.org

Upload your assembled FASTA eukaryotic genome file.

Priority (highest priority first)

    CpG island detection


    Splicing site detection


    Functional Enrichment Analysis


    Protein-Protein Interactions Analysis



    Choose your tools:

    Genes
      • Gene (protein-coding genes): prediction of gene locations and splice sites.
      • Mode of work: evidence-based or ab initio prediction.
      • Functional Annotation: functional gene prediction.

    Repeats
      • Repeats: detection of repeating sequences.

    tRNAs
      • tRNA: prediction of tRNA sequences.

    rRNAs
      • rRNA: search for rRNA sequence matches.

    Assembly Validation
      • Genome Completeness: validate genome completeness.
      • Quality-Control: contamination detection.



    The Modular Open-Source Genome Annotator (MOSGA) is a pipeline that creates draft genome annotations through a graphical user interface. It combines several specific prediction tools and generates a submission-ready annotation file.

    The source code is freely available on Gitlab.com or Zenodo.com (DOI: 10.5281/zenodo.5121228). We recommend building a new Docker container from the Dockerfile in the linked Gitlab repository. MOSGA is modular by design, which allows easy integration of new prediction tools or even whole third-party pipelines.

    For any questions or comments, please contact us: roman.martin@uni-marburg.de. We are happy to receive new suggestions or even merge requests for a pipeline extension. For an overview of how MOSGA operates, we recommend reading our Gitlab wiki page.

    We provide an example data set: the draft genome annotation of the Cafeteria roenbergensis BVI strain. We initially annotated this genome with an early version of MOSGA (Hackl et al., 2020).

    Hackl, T., Martin, R., Barenhoff, K. et al. Four high-quality draft genome assemblies of the marine heterotrophic nanoflagellate Cafeteria roenbergensis. Sci Data 7, 29 (2020).

    We provide two examples for the comparative genomics workflow: the Saccharomyces species phylogenetics and the Saccharomyces gene comparison. An exemplary annotation job for the organelle scanner, based on the Nannochloropsis oceanica genome, is also available.

    Please observe the licenses of the selected tools.

    Whenever you use MOSGA, please cite us:
    Roman Martin, Thomas Hackl, Georges Hattab, Matthias Fischer, Dominik Heider (2020). MOSGA: Modular Open-Source Genome Annotator. Bioinformatics. 36(22-23). 5514–5515. doi: 10.1093/bioinformatics/btaa1003.

    Roman Martin, Hagen Dreßler, Georges Hattab, Thomas Hackl, Matthias Fischer, Dominik Heider (2021). MOSGA 2: Comparative genomics and validation tools. Computational and Structural Biotechnology Journal. 19. 5504-5509. doi: 10.1016/j.csbj.2021.09.024.

    The Philipps University of Marburg hosts this MOSGA instance for demonstration purposes. It runs on an AMD Zen processor with 16 threads and 32 GB of memory.

    We keep the last 100 job submissions online. Once that limit is exceeded, we delete the oldest submission that is more than 14 days old. Incoming jobs are queued and processed as soon as possible. Computation tasks that load our hardware for longer than 48 hours may be terminated. We recommend not uploading files larger than 2 GiB.
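
    The retention rule described above can be sketched as follows. This is one possible reading of the stated policy, not MOSGA's actual implementation: the `Job` type, the `jobs_to_delete` function and its parameter names are hypothetical, and the sketch assumes that a job younger than 14 days is never deleted, even when more than 100 jobs are stored.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Job(NamedTuple):
    uid: str
    submitted: datetime


def jobs_to_delete(jobs: List[Job], now: datetime,
                   max_jobs: int = 100, min_age_days: int = 14) -> List[Job]:
    """Pick the oldest jobs to remove once more than max_jobs are stored.

    Only jobs older than min_age_days are eligible, so the store may
    temporarily hold more than max_jobs entries.
    """
    excess = len(jobs) - max_jobs
    if excess <= 0:
        return []
    cutoff = now - timedelta(days=min_age_days)
    # Oldest eligible submissions first.
    eligible = sorted((j for j in jobs if j.submitted < cutoff),
                      key=lambda j: j.submitted)
    return eligible[:excess]


# Usage: 102 stored jobs, one submitted per day going back from "now".
now = datetime(2021, 4, 1)
jobs = [Job(f"job{i}", now - timedelta(days=i)) for i in range(102)]
doomed = jobs_to_delete(jobs, now)
print([j.uid for j in doomed])  # ['job101', 'job100']
```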

    We reserve the right to analyze failed jobs to determine errors and provide bug fixes and quality improvements. Your results will not be shared and are deleted regularly.

    If you provide a notification email address, we may contact you if your job fails, to help fix or avoid the issue.

