Herbaria around the world hold large collections of sample sheets: dried, pressed plants waiting for taxonomists to annotate and classify them. That process takes time, and many herbaria don't have the resources to catalogue their sheets. Researchers at the Costa Rica Institute of Technology recently undertook a study that used deep learning to analyze a large dataset covering thousands of species from herbaria, to see whether they could set up a fully automated system to help identify the thousands of plants in collections around the world.
Using a convolutional neural network (CNN) and various datasets from the iDigBio portal and other sources (Costa Rica and France), they trained the network to learn discriminant visual features of plants from thousands of herbarium sample sheets. They report that the approach "… could potentially lead to the creation of a semi, or even fully, automatic system to help taxonomists and experts do their annotation, classification, and revision work at herbarium." The researchers also found that what the network learned could be transferred between regions when they tested a dataset from Costa Rica against another dataset from France. They add that removing the handwritten tags, barcodes, logos and other markings from the sample sheets would improve learning and classification. Finally, they found that the learning does not transfer to field images of trees, leaves and flowers; it is best used for herbarium sample sheets.
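To make the point about removing tags, barcodes and logos concrete, here is a minimal sketch in plain Python. It is not the researchers' method, which this summary does not describe. It assumes the sheet is a grayscale image stored as a list of pixel rows and that the positions of the markings are already known; the function name `mask_regions` and the box coordinates are hypothetical, chosen only for illustration.

```python
def mask_regions(image, boxes, fill=255):
    """Return a copy of `image` with every box painted over with `fill`.

    image: list of rows, each row a list of grayscale ints (0-255).
    boxes: iterable of (top, left, bottom, right); bottom and right are
           exclusive. Boxes reaching past the image edge are clipped.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    masked = [row[:] for row in image]  # copy so the original is untouched
    for top, left, bottom, right in boxes:
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                masked[y][x] = fill
    return masked


# A tiny 4x5 stand-in for a scanned sheet (values are arbitrary).
sheet = [[10 * (y + 1) + x for x in range(5)] for y in range(4)]

# Hypothetical barcode location in the bottom-right corner (an assumption).
label_boxes = [(2, 3, 4, 5)]

cleaned = mask_regions(sheet, label_boxes)
```

In practice the hard part is finding the markings in the first place; this sketch only shows what painting them over before training could look like once their locations are known.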