The curse of the uncultured fungus
Journal article, 2022

The international DNA sequence databases abound in fungal sequences not annotated beyond the kingdom level, typically bearing names such as “uncultured fungus”. These sequences beget low-resolution mycological results and invite further deposition of similarly poorly annotated entries. What do these sequences represent? This study uses a 767,918-sequence corpus of public full-length fungal ITS sequences to estimate what proportion of the 95,055 “uncultured fungus” sequences represent truly unidentifiable fungal taxa, and what proportion would have been straightforward to annotate to some more meaningful taxonomic level at the time of sequence deposition. Our results suggest that more than 70% of these sequences would have been trivial to identify to at least the order/family level at the time of sequence deposition, hinting that factors other than poor availability of relevant reference sequences explain the low-resolution names. We speculate that researchers’ perceived lack of time and lack of insight into the ramifications of this problem are the main explanations for the low-resolution names. We were surprised to find that more than a fifth of these sequences seem to have been deposited by mycologists rather than by researchers unfamiliar with the consequences of poorly annotated fungal sequences in molecular repositories. The proportion of these needlessly poorly annotated sequences does not decline over time, suggesting that this problem must not be left unchecked.

Taxonomic annotation

Species identification

Scientific practice

DNA barcoding

Data mining

Data interoperability

Authors

Kessy Abarenkov

Tartu Ülikooli loodusmuuseum

Erik Kristiansson

Chalmers, Mathematical Sciences, Applied Mathematics and Statistics

University of Gothenburg

M. Ryberg

Uppsala University

Sandra Nogal-Prata

Real Jardín Botánico

Daniela Gómez-Martínez

University of Gothenburg

Katrin Stüer-Patowsky

Technical University of Munich

Tobias Jansson

University of Gothenburg

Sergei Põlme

Tartu Ülikooli loodusmuuseum

Masoomeh Ghobad-Nejhad

Iranian Research Organization for Science and Technology

Natàlia Corcoll

University of Gothenburg

Ruud Scharn

University of Gothenburg

Marisol Sánchez-García

Swedish University of Agricultural Sciences (SLU)

Maryia Khomich

University of Bergen

Christian Wurzbacher

Technical University of Munich

R. Henrik Nilsson

University of Gothenburg

MycoKeys

1314-4057 (ISSN) 1314-4049 (eISSN)

Vol. 86, pp. 177-194

Subject Categories

Biological Systematics

Bioinformatics and Systems Biology

Genetics

DOI

10.3897/MYCOKEYS.86.76053

More information

Latest update

3/24/2022