Ancestry analysis using a self-developed 56 AIM-InDel loci and machine learning methods.

Journal: Forensic science international
PMID:

Abstract

Insertion/deletion (InDel) polymorphisms can serve as ancestry-informative markers (AIMs) in ancestry analysis. In this study, a self-developed panel of 56 ancestry-informative InDels was used to investigate the genetic structure of the Chinese Inner Mongolia Manchu group and its genetic relationships with 26 reference populations. The Inner Mongolia Manchu group was genetically close to East Asian populations, especially the Han Chinese in Beijing. Moreover, populations from northern and southern East Asia differed clearly in their ancestral components, suggesting that the panel may be useful for distinguishing between them. Subsequently, four machine learning models were built on the 56 AIM-InDel loci to evaluate the panel's performance in ancestry prediction. The random forest model performed better in ancestry prediction, with 91.87% and 99.73% accuracy for the five and three continental populations, respectively. The random forest model assigned the individuals of the Inner Mongolia Manchu group to the East Asian populations, and they showed closer genetic affinities with northern East Asian populations. Furthermore, the random forest model distinguished 87.18% of the Inner Mongolia Manchus from the East Asian populations, suggesting that a random forest model based on the 56 ancestry-informative InDels could be a useful tool for ancestry analysis.
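
The random forest step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the implementation, hyperparameters, or genotype data, so everything below is an assumption made for illustration: the three populations (POP_A, POP_B, POP_C), their allele frequencies, the sample sizes, the tree depth, and the number of trees are all hypothetical, and the genotypes are simulated. Only the panel size of 56 loci comes from the text. Each genotype is coded as the number of insertion alleles (0, 1, or 2).

```python
import random
from collections import Counter

N_LOCI = 56  # panel size from the abstract; all other values are illustrative

def simulate(pop_freqs, n_per_pop, rng):
    """Draw genotypes (0/1/2 insertion-allele counts) for each hypothetical population."""
    X, y = [], []
    for label, freqs in pop_freqs.items():
        for _ in range(n_per_pop):
            X.append([(rng.random() < f) + (rng.random() < f) for f in freqs])
            y.append(label)
    return X, y

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, rng, n_feats, depth=0, max_depth=6):
    """Grow one tree; each node considers a random subset of loci."""
    if depth == max_depth or len(set(y)) == 1:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    best = None
    for j in rng.sample(range(len(X[0])), n_feats):
        for t in (0, 1):  # split on genotype <= t versus genotype > t
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(X)
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:
        return Counter(y).most_common(1)[0][0]
    _, j, t, left, right = best
    return (j, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       rng, n_feats, depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       rng, n_feats, depth + 1, max_depth))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are labels
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_feats = max(1, int(len(X[0]) ** 0.5))  # sqrt(number of loci) per split
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng, n_feats))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

# Hypothetical allele frequencies: at every locus the three populations differ.
data_rng = random.Random(42)
pop_names = ["POP_A", "POP_B", "POP_C"]
pop_freqs = {name: [] for name in pop_names}
for _ in range(N_LOCI):
    freqs = [0.1, 0.5, 0.9]
    data_rng.shuffle(freqs)
    for name, f in zip(pop_names, freqs):
        pop_freqs[name].append(f)

X_train, y_train = simulate(pop_freqs, 60, data_rng)
X_test, y_test = simulate(pop_freqs, 30, data_rng)

forest = fit_forest(X_train, y_train)
predictions = [predict(forest, row) for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"Held-out accuracy on simulated data: {accuracy:.2%}")
```

Because the simulated populations are far more differentiated than real continental or regional groups, the accuracy printed here says nothing about the panel itself; the sketch only shows how genotype vectors over the 56 loci could feed a bootstrap-and-vote classifier.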

Authors

  • Liu Liu
    Department of Oral and Maxillofacial Radiology, School of Dentistry, Dental Science Research Institute, Chonnam National University, Gwangju, South Korea.
  • Shuanglin Li
    Department of Anatomy and Histology, School of Basic Medical Sciences, Shenzhen University Medical School, Shenzhen University, 1066 Xueyuan Avenue, Shenzhen, Guangdong, China.
  • Wei Cui
  • Yating Fang
    Department of Industrial and Systems Engineering, Rutgers University, Piscataway, New Jersey 08854, United States.
  • Shuyan Mei
    Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, 1838 Guangzhou Avenue North, Guangzhou, Guangdong, PR China.
  • Man Chen
    Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing 210014, China.
  • Hui Xu
    No 202 Hospital of People's Liberation Army, Liaoning 110003, China.
  • Xiaole Bai
    Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, 1838 Guangzhou Avenue North, Guangzhou, Guangdong, PR China.
  • Bofeng Zhu
    Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, 1838 Guangzhou Avenue North, Guangzhou, Guangdong, PR China; Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an Jiaotong University, 99 Yanxiang Road, Xi'an, Shaanxi, PR China. Electronic address: zhubofeng7372@126.com.