Acta Entomologica Sinica ›› 2024, Vol. 67 ›› Issue (9): 1251-1261. doi: 10.16380/j.kcxb.2024.09.009

• RESEARCH PAPERS •

Butterfly recognition based on Local-Global-VIT fine-grained classification algorithm

LI Jian-Xiang1, LI Xiao-Lin1, WANG Rong2, ZHANG Yuan-Zi1, CHEN Shu-Wu1, ZHANG Fei-Ping2,3, HUANG Shi-Guo1,3,*   

  (1. College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China; 2. College of Forestry, Fujian Agriculture and Forestry University, Fuzhou 350002, China; 3. Key Laboratory of Integrated Pest Management in Ecological Forests, Fujian Province University, Fuzhou 350002, China)
  • Online: 2024-09-20 Published: 2024-10-22

Abstract: 【Aim】 Accurate identification of butterfly species and dynamic monitoring of changes in butterfly community diversity play a significant role in habitat quality assessment and ecological environment restoration. Existing butterfly recognition methods rely solely on global features and overlook local features, and consequently recognize ecological images inadequately. This study aims to develop a butterfly recognition method based on the Local-Global-VIT fine-grained classification algorithm to address this limitation.
【Methods】 A dataset of 25 279 butterfly images of 200 species from five families was used for recognition. Various data augmentation techniques were employed to expand the image data. By exploiting the hierarchical structure and self-attention mechanism of the vision transformer (VIT), the method selected local tokens layer by layer and retained them up to the final layer, so that the discriminative local features of butterflies were learned. High-level global tokens were aggregated to mitigate interference from complex backgrounds. The contrastive loss was optimized to widen the inter-class gap and improve differentiation. Additionally, a suitable learning rate adjustment strategy and transfer learning were applied to optimize the model's convergence, thereby improving performance without increasing the number of parameters.
【Results】 The recognition accuracy of the Local-Global-VIT algorithm reached 91.20% on the large fine-grained public dataset Butterfly-200, an improvement of 1.15% over previous methods. Compared with the state-of-the-art general pest recognition algorithm EfficientNet_b0 and the fine-grained classification algorithm TransFG, the accuracy of the Local-Global-VIT algorithm was higher by 1.83% and 0.64%, respectively, and its F1-score was higher by 1.89% and 0.88%, respectively.
【Conclusion】 Through fine-grained recognition, the Local-Global-VIT algorithm effectively addresses the challenge posed by large intra-class variation and subtle inter-class differences in butterflies, and can accurately identify various butterfly species, thus contributing to efficient habitat quality assessment.
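
Two of the steps named in the Methods, local token selection and the contrastive loss, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of head-averaged class-token attention as the selection score, the top-k rule, the margin-based form of the loss (of the kind used in TransFG-style models) and the margin value 0.4 are all assumptions made for illustration only.

```python
import math


def select_local_tokens(cls_attention, k):
    """Return the indices of the k patch tokens with the highest attention.

    cls_attention holds one weight per patch token for a single layer
    (assumption: class-token attention, already averaged over heads).
    """
    ranked = sorted(range(len(cls_attention)),
                    key=lambda i: cls_attention[i], reverse=True)
    return sorted(ranked[:k])


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(embeddings, labels, margin=0.4):
    """Margin-based contrastive loss averaged over all ordered pairs.

    Same-class pairs are penalised by (1 - similarity). Different-class
    pairs are penalised only when their similarity exceeds the margin
    (hypothetical value), which pushes classes apart.
    """
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        for j in range(n):
            sim = cosine(embeddings[i], embeddings[j])
            if labels[i] == labels[j]:
                total += 1.0 - sim
            else:
                total += max(sim - margin, 0.0)
    return total / (n * n)


# Toy usage: class-token attention over four patch tokens in two layers.
layer_attention = [
    [0.10, 0.50, 0.30, 0.10],  # layer 1
    [0.45, 0.05, 0.15, 0.35],  # layer 2
]
selected = [select_local_tokens(a, k=2) for a in layer_attention]
```

In this toy setting each layer contributes its own top-k indices, which mirrors the layer-by-layer selection described above; how the selected tokens are carried forward and combined with the aggregated global tokens is specific to the paper and is not reproduced here.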

Key words: Butterfly, image recognition, fine-grained classification, vision transformer, local token selection, global token aggregation