Acta Entomologica Sinica ›› 2024, Vol. 67 ›› Issue (2): 183-192. doi: 10.16380/j.kcxb.2024.02.004

• Research Article •

Construction and annotation of the full-length transcriptome of the larval gut of Apis cerana cerana (Hymenoptera: Apidae) workers

SONG Yu-Xuan (宋宇轩)1,#, LI Kun-Ze (李坤泽)1,#, ZANG He (臧贺)1, JING Xin (荆欣)1, FAN Xiao-Xue (范小雪)1, ZOU Pei-Yuan (邹培缘)1, CHEN Da-Fu (陈大福)1,2,3, FU Zhong-Min (付中民)1,2,3,*, GUO Rui (郭睿)1,2,3,*

  (1. College of Bee Science and Biomedicine, Fujian Agriculture and Forestry University, Fuzhou 350002, China; 2. National & Local United Engineering Laboratory of Natural Biotoxin, Fuzhou 350002, China; 3. Apicultural Research Institute of Fujian Province, Fuzhou 350002, China)
  • Online: 2024-02-20 Published: 2024-03-27

Abstract: 【Aim】 To assemble and annotate a high-quality full-length transcriptome of the larval gut of Apis cerana cerana workers using nanopore sequencing. 【Methods】 Three-day-old larvae of A. c. cerana workers were inoculated with Ascosphaera apis, and the gut transcriptomes of the 4-, 5- and 6-day-old larvae (AcT4, AcT5 and AcT6, respectively) were sequenced on the Nanopore PromethION system to identify full-length transcript sequences. Full-length transcripts previously identified from nanopore sequencing data of the guts of 4-, 5- and 6-day-old A. c. cerana worker larvae not inoculated with A. apis were pooled with the full-length transcripts identified in this study, and redundant full-length transcripts were then removed. The identified non-redundant full-length transcripts were aligned against the Nr, KOG, eggNOG and GO databases for annotation. Four methods, CPC, CNCI, CPAT and Pfam, were used to predict long non-coding RNAs (lncRNAs). 【Results】 AcT4, AcT5 and AcT6 yielded 14 474 634, 10 461 827 and 11 890 978 raw reads, which included 11 898 582, 8 630 186 and 9 091 035 full-length transcripts, respectively. After redundancy removal, 27 815, 21 781 and 20 004 non-redundant full-length transcripts were identified, with N50 lengths of 1 900, 1 961 and 2 294 bp, average lengths of 1 534, 1 584 and 1 792 bp, and longest read lengths of 10 855, 10 837 and 10 887 bp, respectively. A total of 40 562 non-redundant full-length transcripts were identified, of which 35 415, 24 646, 34 054 and 23 053 could be annotated to the Nr, KOG, eggNOG and GO databases, respectively. In the Nr database, the species with the highest number and proportion of annotated full-length transcripts was A. cerana (20 310 transcripts, 57.35%), followed by A. mellifera (4 686 transcripts, 13.23%), A. dorsata (2 536 transcripts, 7.16%) and A. florea (2 079 transcripts, 5.87%). The non-redundant full-length transcripts were annotated to 25 functional categories in the eggNOG database, such as unknown function and post-translational modification, protein turnover and molecular chaperones; 25 functional categories in the KOG database, such as general function prediction only and signal transduction mechanisms; 50 functional terms in the three major GO categories of biological process, cellular component and molecular function; and 196 pathways in the KEGG database, such as ribosome and RNA transport. A total of 2 301 high-confidence lncRNAs were identified, comprising four types: sense, antisense, intronic and intergenic lncRNAs. 【Conclusion】 The first full-length transcriptome of the larval gut of A. c. cerana workers has been constructed and annotated, providing a high-quality reference and a key foundation for molecular biology and omics studies of A. c. cerana and other subspecies of A. cerana.

Key words: Apis cerana, Apis cerana cerana, Ascosphaera apis, gut, full-length transcriptome, long non-coding RNA, 3rd-generation sequencing technology, nanopore sequencing
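
Note on the N50 statistic reported in the abstract: the sketch below shows one common way the N50 and the mean length of a set of transcripts can be computed. It is a minimal illustration, not the pipeline used in the study; the function name `n50` and the example lengths are hypothetical and are not data from the paper. The N50 is taken here as the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total bases.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Walk from the longest sequence downwards, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which half of all bases are covered is the N50.
        if running >= half_total:
            return length


# Hypothetical transcript lengths in bp (illustrative only, not from the study).
example_lengths = [500, 800, 1200, 1900, 2300, 3100]
example_n50 = n50(example_lengths)
example_mean = sum(example_lengths) / len(example_lengths)
print(example_n50, round(example_mean, 1))
```

Because the N50 weights long transcripts by the bases they contribute, it is normally larger than the mean length, which matches the pattern of the values reported for AcT4, AcT5 and AcT6.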