Taxonomical evaluation of plant genetic markers by Bayesian Classifier

dc.contributor.author: Luisa Matiz-Céron
dc.contributor.author: Alejandro Reyes
dc.contributor.author: Juan Manuel Anzola
dc.coverage.spatial: Bolivia
dc.date.accessioned: 2026-03-22T20:46:17Z
dc.date.available: 2026-03-22T20:46:17Z
dc.date.issued: 2020
dc.description.abstract: DNA barcodes are standardized sequences of 400-800 bp that vary at different taxonomic levels and make it possible to identify individuals of species that have previously been taxonomically assigned. Several barcodes have been identified for different groups in the tree of life. However, some groups lack an accurate DNA marker and, even more so, accurate strategies for verifying their taxonomic affiliation. Several DNA barcodes have been proposed for plants; nonetheless, their classification potential has not been evaluated for metabarcoding, and as a result none of them appears to outperform the others in this area. One tool that has recently gained traction is the Naïve Bayesian Classifier, which is based on the assumed independence of attributes and the allocation of categories in each context. The present study evaluates the classification power of several plant genetic markers that have been proposed as barcodes (trnL, rpoB, rbcL, matK, psbA-trnH and psbK) using a Naïve Bayesian Classifier, in order to determine which markers perform best at different taxonomic levels for metabarcoding analysis and to identify genera that are problematic at species assignment. We propose matK and trnL as candidate markers for assignment up to the genus level. Some problematic genera (Aegilops, Gueldenstaedtia, Helianthus, Oryza, Shorea, Thysananthus and Triticum) within certain families could lead to misclassification when present in a sample, no matter which marker is used. Finally, we offer recommendations for performing taxonomic identification of plants in samples containing multiple individuals.
dc.identifier.doi: 10.22541/au.160648578.83917620/v1
dc.identifier.uri: https://doi.org/10.22541/au.160648578.83917620/v1
dc.identifier.uri: https://andeanlibrary.org/handle/123456789/83971
dc.language.iso: en
dc.source: Universidad de Los Andes
dc.subject: Biology
dc.subject: DNA barcoding
dc.subject: Classifier (UML)
dc.subject: Bayesian probability
dc.subject: Phylogenetic tree
dc.subject: Evolutionary biology
dc.subject: Artificial intelligence
dc.subject: Context (archaeology)
dc.subject: Machine learning
dc.subject: Pattern recognition (psychology)
dc.title: Taxonomical evaluation of plant genetic markers by Bayesian Classifier
dc.type: preprint
