Acta Entomologica Sinica, 2024, Vol. 67, Issue (2): 183-192. doi: 10.16380/j.kcxb.2024.02.004

Construction and annotation of the full-length transcriptome of the larval gut of Apis cerana cerana (Hymenoptera: Apidae) workers

SONG Yu-Xuan1,#, LI Kun-Ze1,#, ZANG He1, JING Xin1, FAN Xiao-Xue1, ZOU Pei-Yuan1, CHEN Da-Fu1,2,3, FU Zhong-Min1,2,3,*, GUO Rui1,2,3,*

(1. College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; 2. National & Local United Engineering Laboratory of Natural Biotoxin, Fuzhou 350002, China; 3. Apicultural Research Institute of Fujian Province, Fuzhou 350002, China)
Online: 2024-02-20 Published: 2024-03-27

Abstract: 【Aim】 To assemble and annotate a high-quality full-length transcriptome of the larval gut of Apis cerana cerana workers using nanopore sequencing technology. 【Methods】 Three-day-old larvae of A. c. cerana workers were inoculated with Ascosphaera apis, and the gut transcriptomes of the 4-, 5- and 6-day-old larvae (AcT4, AcT5 and AcT6, respectively) were sequenced on the Nanopore PromethION system to identify full-length transcript sequences. Full-length transcripts previously identified from nanopore sequencing data of the guts of 4-, 5- and 6-day-old A. c. cerana worker larvae not inoculated with A. apis were pooled with the full-length transcripts obtained in this study, and redundant transcripts were removed. The resulting non-redundant full-length transcripts were aligned against the Nr, KOG, eggNOG and GO databases for annotation. Four methods (CPC, CNCI, CPAT and Pfam) were used to predict long non-coding RNAs (lncRNAs). 【Results】 From AcT4, AcT5 and AcT6, 14 474 634, 10 461 827 and 11 890 978 raw reads were obtained, respectively, including 11 898 582, 8 630 186 and 9 091 035 full-length transcripts. After de-redundancy, 27 815, 21 781 and 20 004 non-redundant full-length transcripts were identified, with N50 lengths of 1 900, 1 961 and 2 294 bp, average lengths of 1 534, 1 584 and 1 792 bp, and longest read lengths of 10 855, 10 837 and 10 887 bp, respectively. In total, 40 562 non-redundant full-length transcripts were identified, of which 35 415, 24 646, 34 054 and 23 053 could be annotated in the Nr, KOG, eggNOG and GO databases, respectively. In the Nr database, the species with the highest number and proportion of annotated full-length transcripts was A. cerana (20 310 transcripts, 57.35%), followed by A. mellifera (4 686 transcripts, 13.23%), A. dorsata (2 536 transcripts, 7.16%) and A. florea (2 079 transcripts, 5.87%). The non-redundant full-length transcripts were annotated to 25 functional categories in the eggNOG database, such as "unknown function" and "post-translational modification, protein turnover and molecular chaperones"; 25 functional categories in the KOG database, such as "general function prediction only" and "signal transduction mechanisms"; 50 functional terms across the three major GO categories of biological process, cellular component and molecular function; and 196 pathways in the KEGG database, such as ribosome and RNA transduction. A total of 2 301 high-confidence lncRNAs were identified, comprising four types: sense lncRNA, anti-sense lncRNA, intronic lncRNA and intergenic lncRNA. 【Conclusion】 The first full-length transcriptome of the larval gut of A. c. cerana workers has been constructed and annotated, providing a high-quality reference and a key foundation for studies on the molecular biology and omics of A. c. cerana and other subspecies of A. cerana.
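
The summary statistics reported above can be illustrated with a minimal sketch. It is not the authors' pipeline. The function names `n50` and `percent` are hypothetical, the toy length list is invented for illustration, and the use of the Nr-annotated count (35 415) as the denominator for the species proportions is an assumption that happens to reproduce the reported percentages.

```python
def n50(lengths):
    """Return the N50 of a set of transcript lengths.

    N50 is the length L such that transcripts of length >= L together
    account for at least half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to 2 decimals."""
    return round(100 * count / total, 2)


# Counts taken from the abstract; the denominator is assumed to be the
# number of transcripts annotated in the Nr database.
NR_ANNOTATED = 35415
SPECIES_COUNTS = {
    "A. cerana": 20310,
    "A. mellifera": 4686,
    "A. dorsata": 2536,
    "A. florea": 2079,
}
proportions = {sp: percent(n, NR_ANNOTATED) for sp, n in SPECIES_COUNTS.items()}

# Hypothetical toy lengths (bp), only to show how N50 is derived.
toy_lengths = [500, 1200, 1900, 2300, 3100]
toy_n50 = n50(toy_lengths)

if __name__ == "__main__":
    for species, share in proportions.items():
        print(f"{species}: {share}%")
    print(f"toy N50: {toy_n50} bp")
```

Because N50 weights each transcript by its length, it is higher than the average length whenever the length distribution is skewed toward short transcripts, which matches the pattern in the reported values (for example, 1 900 bp versus 1 534 bp for AcT4).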

Key words: Apis cerana, Apis cerana cerana, Ascosphaera apis, gut, full-length transcriptome, long non-coding RNA, 3rd-generation sequencing technology, nanopore sequencing