Acta Entomologica Sinica ›› 2016, Vol. 59 ›› Issue (10): 1058-1068. doi: 10.16380/j.kcxb.2016.10.004

Identification, characteristics and distribution of microsatellites in the whole genome of Anopheles sinensis (Diptera: Culicidae)

WANG Xiao-Ting, ZHANG Yu-Juan, HE Xiu, MEI Ting, CHEN Bin*   

  1. (Institute of Entomology and Molecular Biology, College of Life Sciences, Chongqing Normal University, Chongqing 401331, China)
  • Online: 2016-10-20 Published: 2016-10-20

Abstract: 【Aim】 Anopheles sinensis is an important malaria vector in China and Southeast Asia. This study aims to identify and analyze the simple sequence repeats (SSRs, also called microsatellites) in the whole genome of An. sinensis and to annotate the functions of SSR-containing genes, so as to provide a basis for selecting molecular genetic markers in An. sinensis and to lay a foundation for further comparative genomics studies of SSRs in insects. 【Methods】 The MISA program was used to identify SSRs in the An. sinensis genome, and Excel 2010 was used to tabulate the lengths of the identified SSRs. Perl scripts written for this study calculated the base content of the SSRs from their sequences and mapped the SSRs to the genome using the location information from SSR identification. WEGO was used to perform GO functional annotation of SSR-containing genes in An. sinensis and An. gambiae. 【Results】 A total of 105 981 SSRs were identified in the An. sinensis genome, a genomic density of 365.5 SSRs per Mb. Of these, 100 391 (94.7%) are perfect SSRs and the remaining 5 590 (5.3%) are compound SSRs. Mononucleotide SSRs (58 837, 55.5%) are the most abundant, followed by dinucleotide SSRs (30 345, 28.6%), trinucleotide SSRs (15 104, 14.3%), tetranucleotide SSRs (1 530, 1.4%), pentanucleotide SSRs (121, 0.1%) and hexanucleotide SSRs (44, less than 0.1%). (A)n SSRs are the most frequent type, followed by (AC)n, (AG)n, (C)n, (AGC)n, (ATC)n, (ACG)n and (ACC)n; each of these types has more than 2 000 SSRs. SSRs 10-20 bp in length account for 87.1% of the total. In trinucleotide SSRs the GC-content (53%) is slightly higher than the AT-content, whereas in the other SSRs the AT-content (63%) is clearly higher than the GC-content (37%). A total of 90 632 SSRs (85%) are located in intergenic regions and 15 349 (15%) in gene regions, of which 2 782 (3%) are in exons and 12 567 (12%) in introns. Comparison of the GO functional annotation of SSR-containing genes in An. sinensis and An. gambiae showed that the percentages of genes in all subcategories are broadly similar between the two species; the percentage of electron carrier genes, however, is significantly higher in An. sinensis (0.9%) than in An. gambiae (0.1%). 【Conclusion】 This is the first systematic study of SSRs in the whole genome of a mosquito species. The study provides an important basis for selecting SSRs as biomarkers for studies of population genetics, genetic variation, and the genetic location and regulation mechanisms of functional genes, as well as for studies on the diversity and evolution of SSRs in insects.
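
The three computational steps named in the Methods (finding perfect SSRs, computing their base content, and assigning each SSR to an exon, intron or intergenic region) can be illustrated with a minimal Python sketch. This is not the MISA program or the Perl scripts used in the study. The minimum repeat thresholds in `MIN_REPEATS`, the function names, and the gene/exon coordinate layout are all assumptions made for illustration, since the abstract does not report them.

```python
import re
from collections import Counter

# ASSUMED minimum repeat counts per motif length (illustrative only;
# the abstract does not state the thresholds used in the study).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples, 0-based half-open."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # A motif of `size` bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" is really "A").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // size))
    return sorted(hits)


def at_gc_content(ssr_seqs):
    """Return (AT fraction, GC fraction) over all bases in the given SSR sequences."""
    counts = Counter("".join(ssr_seqs).upper())
    at = counts["A"] + counts["T"]
    gc = counts["G"] + counts["C"]
    total = at + gc
    return (at / total, gc / total) if total else (0.0, 0.0)


def classify_region(pos, genes):
    """Classify a position as 'exon', 'intron' or 'intergenic'.

    `genes` is a hypothetical annotation layout: a list of
    (gene_start, gene_end, [(exon_start, exon_end), ...]), half-open.
    """
    for gene_start, gene_end, exons in genes:
        if gene_start <= pos < gene_end:
            if any(start <= pos < end for start, end in exons):
                return "exon"
            return "intron"
    return "intergenic"
```

For example, `find_perfect_ssrs` could be run on each scaffold, the matched substrings passed to `at_gc_content`, and each SSR start position passed to `classify_region` with coordinates parsed from a gene annotation file. The regex scan is greedy and simplified, and compound SSRs (5.3% of the total in the study) are not handled here.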

Key words: Anopheles sinensis, genome, microsatellites, characteristics, distribution, base content, GO annotation