The curse of the uncultured fungus
Journal article, 2022

The international DNA sequence databases abound in fungal sequences not annotated beyond the kingdom level, typically bearing names such as “uncultured fungus”. These sequences beget low-resolution mycological results and invite further deposition of similarly poorly annotated entries. What do these sequences represent? This study uses a 767,918-sequence corpus of public full-length fungal ITS sequences to estimate what proportion of the 95,055 “uncultured fungus” sequences represent truly unidentifiable fungal taxa, and what proportion would have been straightforward to annotate to some more meaningful taxonomic level at the time of sequence deposition. Our results suggest that more than 70% of these sequences would have been trivial to identify to at least the order/family level at the time of sequence deposition, hinting that factors other than poor availability of relevant reference sequences explain the low-resolution names. We speculate that researchers’ perceived lack of time and lack of insight into the ramifications of this problem are the main explanations for the low-resolution names. We were surprised to find that more than a fifth of these sequences seem to have been deposited by mycologists rather than by researchers unfamiliar with the consequences of poorly annotated fungal sequences in molecular repositories. The proportion of these needlessly poorly annotated sequences does not decline over time, suggesting that this problem must not be left unchecked.

Taxonomic annotation

Species identification

Scientific practice

DNA barcoding

Data mining

Data interoperability

Authors

Kessy Abarenkov

Tartu Ülikooli loodusmuuseum

Erik Kristiansson

Chalmers, Mathematical Sciences, Applied Mathematics and Statistics

Göteborgs universitet

M. Ryberg

Uppsala universitet

Sandra Nogal-Prata

Real Jardín Botánico

Daniela Gómez-Martínez

Göteborgs universitet

Katrin Stüer-Patowsky

Technische Universität München

Tobias Jansson

Göteborgs universitet

Sergei Põlme

Tartu Ülikooli loodusmuuseum

Masoomeh Ghobad-Nejhad

Iranian Research Organization for Science and Technology

Natàlia Corcoll

Göteborgs universitet

Ruud Scharn

Göteborgs universitet

Marisol Sánchez-García

Sveriges lantbruksuniversitet (SLU)

Maryia Khomich

Universitetet i Bergen

Christian Wurzbacher

Technische Universität München

R. Henrik Nilsson

Göteborgs universitet

MycoKeys

1314-4057 (ISSN) 1314-4049 (eISSN)

Vol. 86, pp. 177-194

Subject categories

Biological systematics

Bioinformatics and systems biology

Genetics

DOI

10.3897/MYCOKEYS.86.76053

More information

Last updated

2022-03-24