Acta Entomologica Sinica, 2016, Vol. 59, Issue (6): 622-631. doi: 10.16380/j.kcxb.2016.06.005

• Research Article •

Genome-wide identification and characterization of the CPF family cuticular protein genes in Anopheles sinensis

刘柏琦, 乔梁, 许柏英, 郑学令, 陈斌*   

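  1. (Institute of Entomology and Molecular Biology, College of Life Sciences, Chongqing Normal University, Chongqing 401331)
  • Publication date: 2016-06-20 Release date: 2016-06-20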

Identification and characterization of the CPF family of cuticular protein genes in the genome of Anopheles sinensis (Diptera: Culicidae)

LIU Bai-Qi, QIAO Liang, XU Bo-Ying, ZHENG Xue-Ling, CHEN Bin*   

  1. (Institute of Entomology and Molecular Biology, College of Life Sciences, Chongqing Normal University, Chongqing 401331, China)
  • Online:2016-06-20 Published:2016-06-20

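Abstract (Chinese version): 【Aim】 To identify the CPF family cuticular protein genes in the Anopheles sinensis genome, analyze their gene structure and characteristics, and infer their possible biological functions; and to compare the CPF family genes of representative mosquito species so as to provide an information framework for this gene family. 【Methods】 Based on the whole-genome sequences of An. sinensis, An. gambiae, An. minimus, Aedes aegypti, Culex quinquefasciatus and Drosophila melanogaster, the CPF family genes of these species were identified with BLASTP, TBLASTN and HMM searches, using the An. gambiae CPF family gene sequences as queries. Bioinformatics methods were used to predict the structure, splicing patterns, signal peptides, transmembrane regions, domains and 3D structures of the An. sinensis CPF family genes. The maximum likelihood (ML) method was used to reconstruct the phylogenetic relationships among these species and to infer the origin and evolution of the CPF family genes. 【Results】 The genomes of An. sinensis, An. gambiae, An. minimus, Ae. aegypti, Cx. quinquefasciatus and D. melanogaster contain 4, 4, 4, 3, 3 and 3 CPF family genes, respectively. The CPF genes of An. sinensis were named AsCPF1, AsCPF2, AsCPF3 and AsCPF4; their full-length cDNA sequences are 736, 2 021, 531 and 1 001 bp, encoding 219, 345, 148 and 185 amino acids, respectively. AsCPF1, AsCPF2 and AsCPF3 each contain only one intron, whereas AsCPF4 contains three introns; all introns are phase 0 introns. AsCPF1, AsCPF2, AsCPF3 and AsCPF4 have 3, 2, 1 and 2 alternative splicing variants, respectively. AsCPF3 has the highest expression level, followed by AsCPF4, AsCPF2 and AsCPF1. The predicted theoretical molecular weights of AsCPF1, AsCPF2, AsCPF3 and AsCPF4 are 22.86, 36.47, 15.08 and 18.66 kD, and their isoelectric points are 9.08, 8.97, 9.44 and 9.16, respectively. The AsCPF proteins contain a conserved 44-amino-acid motif and a C-terminal motif. AsCPF1, AsCPF3 and AsCPF4 have signal peptides and are secretory proteins, whereas AsCPF2 lacks a signal peptide and is a non-secretory protein. Secondary structure analysis showed that all four AsCPFs contain alpha helices, random coils and extended strands, and that only AsCPF4 has a transmembrane segment, located at amino acids 5-27. Phylogenetic analysis showed that CPF3 may be the earliest-diverged CPF family gene, that CPF1 and CPF2 may have arisen from the same ancestral gene through a gene duplication event, and that CPF4 is very likely unique to Anopheles mosquitoes and is the latest-diverged CPF gene. With An. gambiae as the reference, substitution rate analysis showed that the Ka/Ks values of the An. sinensis CPF cuticular proteins are all less than 1, indicating purifying selection. 【Conclusion】 The genome-wide identification and characterization of the CPF family genes in An. sinensis, together with the comparative analysis of CPF family genes in representative mosquitoes, reveals the diversity, structure, amino acid characteristics, origin and evolution of mosquito CPF family genes, providing an information basis for further research on and utilization of this gene family.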

Key words (Chinese version): Anopheles sinensis, cuticular protein, CPF family, conserved motif, evolution

Abstract: 【Aim】 This study aims to identify the CPF family (CPFs) of cuticular protein genes in the Anopheles sinensis genome, to analyze their structure and characteristics, to infer their possible biological functions, and to compare the CPFs of representative mosquito species so as to provide an information framework for this gene family. 【Methods】 We identified the CPFs in the genomes of An. sinensis, An. gambiae, An. minimus, Aedes aegypti, Culex quinquefasciatus and Drosophila melanogaster using BLASTP, TBLASTN and HMM searches with An. gambiae CPFs as queries. Using bioinformatics techniques, we predicted the structure and splicing variation of the An. sinensis CPF genes and the signal peptides, transmembrane regions, structural domains and 3D structures of the An. sinensis CPF proteins. We reconstructed phylogenetic relationships using the maximum likelihood (ML) method and inferred the origin and evolution of CPFs in these species. 【Results】 There are 4, 4, 4, 3, 3 and 3 CPFs in the An. sinensis, An. gambiae, An. minimus, Ae. aegypti, Cx. quinquefasciatus and D. melanogaster genomes, respectively. The CPFs in An. sinensis were named AsCPF1, AsCPF2, AsCPF3 and AsCPF4. Their full-length cDNA sequences are 736, 2 021, 531 and 1 001 bp, encoding 219, 345, 148 and 185 amino acids, respectively. AsCPF1, AsCPF2 and AsCPF3 each have only one intron, whereas AsCPF4 contains three introns; all introns are phase 0. There are 3, 2, 1 and 2 alternative splicing variants for AsCPF1, AsCPF2, AsCPF3 and AsCPF4, respectively. AsCPF3 has the highest expression level, followed by AsCPF4, AsCPF2 and AsCPF1. The theoretical molecular weights of AsCPF1, AsCPF2, AsCPF3 and AsCPF4 are 22.86, 36.47, 15.08 and 18.66 kD, and their isoelectric points are 9.08, 8.97, 9.44 and 9.16, respectively. These AsCPFs contain a conserved 44-amino-acid region and a conserved C-terminal region. AsCPF1, AsCPF3 and AsCPF4 have signal peptide sequences and are secretory proteins, whereas AsCPF2 lacks a signal peptide sequence and is a non-secretory protein. All four AsCPFs contain alpha helices, random coils and extended strands, and only AsCPF4 has a transmembrane region, spanning amino acids 5 to 27. Phylogenetic analysis showed that CPF3 might be the earliest-diverged CPF gene, that CPF1 and CPF2 might have arisen from a common ancestral gene through a gene duplication event, and that CPF4 might be unique to Anopheles mosquitoes and the latest-diverged CPF gene. The Ka/Ks ratios of the CPFs in An. sinensis, with An. gambiae as the reference, are all less than 1, suggesting that these genes have evolved under purifying selection. 【Conclusion】 The whole-genome identification and characterization of CPFs in An. sinensis and the comparison of CPFs in representative mosquito species revealed the diversity, structure, amino acid characteristics, origin and evolution of the CPF gene family in mosquitoes, providing a comprehensive information framework for further research on and utilization of this gene family.
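The lengths and masses reported in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the authors' pipeline: it assumes that each reported protein length excludes the stop codon, so the coding sequence is 3 × (residues + 1) bp, and that whatever remains of the full-length cDNA is untranslated sequence. The function and variable names are hypothetical; only the numbers are taken from the abstract.

```python
# Figures copied from the abstract:
# gene -> (full-length cDNA in bp, protein length in aa, theoretical mass in kD)
REPORTED = {
    "AsCPF1": (736, 219, 22.86),
    "AsCPF2": (2021, 345, 36.47),
    "AsCPF3": (531, 148, 15.08),
    "AsCPF4": (1001, 185, 18.66),
}


def coding_length_bp(protein_length_aa):
    """Coding length in bp: one codon per residue plus one stop codon (assumption)."""
    return 3 * (protein_length_aa + 1)


def summarize(reported):
    """Derive coding length, leftover untranslated length and mean residue mass."""
    rows = {}
    for gene, (cdna_bp, protein_aa, mass_kd) in reported.items():
        cds_bp = coding_length_bp(protein_aa)
        rows[gene] = {
            "cds_bp": cds_bp,
            # Part of the cDNA not accounted for by the coding sequence
            "untranslated_bp": cdna_bp - cds_bp,
            # Mean mass per residue in daltons
            "da_per_residue": round(mass_kd * 1000 / protein_aa, 1),
        }
    return rows


SUMMARY = summarize(REPORTED)

if __name__ == "__main__":
    for gene, row in SUMMARY.items():
        print(gene, row)
```

Under these assumptions every coding sequence fits inside its reported cDNA, and the mean residue masses fall close to the usual rule of thumb of roughly 110 Da per amino acid, so the reported figures are mutually consistent.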

Key words: Anopheles sinensis, cuticular protein, CPF family, conserved motif, evolution