Mining metadata from unidentified ITS sequences in GenBank: a case study in Inocybe (Basidiomycota)

Martin Ryberg; R. Henrik Nilsson; Erik Kristiansson; Mats H. Töpel; Stig Jacobsson; Ellen Larsson

doi:10.1186/1471-2148-8-50

Journal article, 2008

Background The lack of reference sequences from well-identified mycorrhizal fungi often poses a challenge to the inference of taxonomic affiliation of sequences from environmental samples, and many environmental sequences are thus left unidentified. Such unidentified sequences belonging to the widely distributed ectomycorrhizal fungal genus Inocybe (Basidiomycota) were retrieved from GenBank and divided into species, which were identified in a phylogenetic context using a reference dataset from an ongoing study of the genus. The sequence metadata of the unidentified Inocybe sequences stored in GenBank, as well as data from the corresponding original papers, were compiled and used to explore the ecology and distribution of the genus. In addition, the relative occurrence of Inocybe was contrasted with that of other mycorrhizal genera.
Results Most species of Inocybe were found to have less than 3% intraspecific variability in the ITS2 region of the nuclear ribosomal DNA. This cut-off value was used jointly with phylogenetic analysis to delimit the unidentified Inocybe sequences and identify them to species level. A total of 177 unidentified Inocybe ITS sequences corresponding to 98 species were recovered, 32% of which were successfully identified to species level in this study. These sequences account for an unexpectedly large proportion of the publicly available unidentified fungal ITS sequences when compared with other mycorrhizal genera. Eight Inocybe species were reported from multiple hosts, and some even from hosts forming arbutoid or orchid mycorrhizae. Furthermore, Inocybe sequences have been reported from four continents and from climate zones ranging from cold temperate to equatorial. Of the 19 species found in more than one study, six were found in both Europe and North America and one was found in both Europe and Japan, indicating that many north temperate species, at least, are widely distributed.
Conclusions Although DNA-based species identification and circumscription are associated with practical and conceptual difficulties, they also offer new possibilities and avenues for research. Metadata assembly holds great potential to synthesize valuable information from community studies for use in a species and taxonomy-oriented framework.
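
The role of the 3% ITS2 cut-off mentioned in the Results can be illustrated with a small sketch. This is not the authors' pipeline: the abstract states that the cut-off was used jointly with phylogenetic analysis, whereas the code below only groups already-aligned sequences by pairwise distance (single linkage). The function names `p_distance` and `cluster_by_cutoff`, the helper `mutate`, and the toy sequences are all hypothetical and invented for illustration; they are not real ITS2 data.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def cluster_by_cutoff(seqs, cutoff=0.03):
    """Single-linkage grouping of sequences by pairwise distance.

    Two sequences end up in the same cluster if a chain of pairwise
    distances strictly below `cutoff` connects them.
    """
    parent = {name: name for name in seqs}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=sorted)


def mutate(seq, positions):
    """Hypothetical helper: substitute the base at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for i in positions:
        chars[i] = swap[chars[i]]
    return "".join(chars)


# Invented 100-site toy alignment (assumption, not real ITS2 sequences).
base = "ACGT" * 25
toy = {
    "env_1": base,
    "env_2": mutate(base, [0, 50]),       # 2% different from env_1
    "env_3": mutate(base, range(10, 20)), # 10% different from env_1
}

clusters = cluster_by_cutoff(toy)
print(clusters)
```

With these toy inputs, `env_1` and `env_2` fall below the cut-off and are grouped together, while `env_3` forms its own cluster. Single linkage is only one possible choice; it can chain divergent sequences together through intermediates, which is one reason a distance threshold alone is a rough proxy for species delimitation.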

Inocybe

large data sets

data mining

Mycorrhiza

environmental samples

Authors

Martin Ryberg

Göteborgs universitet

R. Henrik Nilsson

Göteborgs universitet

Erik Kristiansson

Chalmers, Mathematical Sciences, Mathematical Statistics

Göteborgs universitet

Mats H. Töpel

Göteborgs universitet

Stig Jacobsson

Göteborgs universitet

Ellen Larsson

Göteborgs universitet

BMC Evolutionary Biology

1471-2148 (eISSN)

Vol. 8, article 50

Subject categories (SSIF 2011)

Biological Systematics

Ecology

Bioinformatics and Systems Biology

DOI

10.1186/1471-2148-8-50

More information

Last updated

2022-04-05
