2017, Vol. 60, Issue (11): 1339-1348. doi: 10.16380/j.kcxb.2017.11.012

• RESEARCH PAPERS •

Automatic identification of butterfly specimen images at the family level based on deep learning method

ZHOU Ai-Ming1, MA Peng-Peng1, XI Tian-Yu2, WANG Jiang-Ning2, FENG Jin1, SHAO Ze-Zhong1, TAO Yu-Lei1, YAO Qing1,*   

  1. School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; 2. Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  • Online: 2017-11-20 Published: 2017-11-20

Abstract: 【Aim】 This study explores the feasibility and generalization ability of a deep learning model for the automatic identification of butterfly images at the family level. 【Methods】 To improve the robustness and generalization performance of the model, data augmentation was applied to images of 1,117 butterfly species from six families: the number of training images was increased by flipping images horizontally, increasing image contrast and brightness, and adding noise. In the Caffe framework, an ImageNet-trained convolutional neural network model was obtained after 310,000 iterations. Using transfer learning, the training set of butterfly images was then used to train a new CaffeNet model to automatically identify butterflies at the family level. To compare the generalization ability of the deep learning-based CaffeNet model with that of models based on traditional pattern recognition methods, global and local features were extracted from the same training samples, and a support vector machine (SVM) classifier was trained. All models were evaluated on two different test sample sets. 【Results】 When the test samples, like the training samples, were specimen images, the CaffeNet model had a mean accuracy of 95.8%, while the SVM classifier based on Gabor features had a mean accuracy of 94.8% across the six butterfly families. When the test samples were natural images of butterflies, the accuracy of both the CaffeNet and SVM models decreased; however, the CaffeNet model still achieved 65.6%, whereas the SVM classifier based on Gabor features reached only 38.9%. 【Conclusion】 The deep learning-based butterfly identification model achieves a high identification rate at the family level, with greater robustness and generalization ability than traditional pattern recognition models that rely on manually extracted and selected global and local features.

Key words: Butterfly, specimen images, automatic identification, deep learning, CaffeNet model, feature extraction, support vector machine
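
The three augmentation operations named in the Methods (horizontal flipping, increasing contrast and brightness, and adding noise) could be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the linear contrast/brightness transform, the choice of Gaussian noise, and all parameter values are assumptions, since the abstract does not specify them.

```python
import numpy as np


def flip_horizontal(img):
    """Mirror an H x W x C image left to right."""
    return img[:, ::-1]


def adjust_contrast_brightness(img, contrast=1.2, brightness=20.0):
    """Apply a linear transform: out = contrast * img + brightness.

    The values 1.2 and 20.0 are hypothetical; the abstract gives none.
    """
    out = img.astype(np.float32) * contrast + brightness
    return np.clip(out, 0, 255).astype(np.uint8)


def add_gaussian_noise(img, sigma=10.0, rng=None):
    """Add zero-mean Gaussian noise (the noise type is an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=img.shape)
    out = img.astype(np.float32) + noise
    return np.clip(out, 0, 255).astype(np.uint8)


def augment(img, rng=None):
    """Return the original image plus one variant per augmentation."""
    return [
        img,
        flip_horizontal(img),
        adjust_contrast_brightness(img),
        add_gaussian_noise(img, rng=rng),
    ]
```

Under these assumptions, calling `augment(img)` on a single `uint8` image array yields four training images, so a specimen set would grow fourfold before training.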