Autosomal deletion/insertion polymorphisms for global stratification analyses and ancestry origin inferences of different continental populations by machine learning methods.

Journal: Electrophoresis
PMID:

Abstract

Extensive population data for the 30 deletion/insertion polymorphisms (DIPs) of the Investigator DIPplex kit have been reported for different continental populations. Here, we assessed the genetic distributions of these 30 DIPs in different continental populations to identify candidate ancestry-informative DIPs. We also explored the effectiveness of machine learning methods for ancestry analysis. Pairwise informativeness (In) values of the 30 DIPs revealed that six loci displayed relatively high In values (>0.1) among different continental populations. In addition, more loci showed high population-specific divergence (PSD) values in the African population. Based on the pairwise In and PSD values of the 30 DIPs, 17 DIPs in the Investigator DIPplex kit were selected for ancestry analyses of African, European, and East Asian populations. Although the full set of 30 DIPs provided better ancestry resolution of these continental populations in the PCA and population genetic structure analyses, the 17 DIPs could also distinguish them. More importantly, these 17 DIPs showed more balanced cumulative PSD distributions across these populations. Six machine learning methods were used to perform ancestry analyses of these continental populations based on the 17 DIPs. The results revealed that naïve Bayes performed best, whereas k-nearest neighbor showed relatively low performance. In summary, these machine learning methods, especially naïve Bayes, could serve as valuable tools for ancestry analysis.
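
The pairwise In screening described in the abstract can be illustrated with a minimal sketch. It assumes that In denotes the informativeness for assignment statistic of Rosenberg et al. (2003), computed with natural logarithms; the abstract does not state the formula. The function name `informativeness` and the allele frequencies below are hypothetical and are not taken from the study; only the 0.1 cutoff comes from the abstract.

```python
import math


def informativeness(insertion_freqs):
    """Informativeness for assignment (In) of one biallelic DIP locus.

    insertion_freqs: insertion-allele frequency in each of K populations.
    Assumed formula (Rosenberg et al. 2003), summed over both alleles j:
        In = sum_j [ -p_j * ln(p_j) + (1/K) * sum_i p_ij * ln(p_ij) ]
    where p_j is the mean frequency of allele j across the K populations.
    """
    k = len(insertion_freqs)
    deletion_freqs = [1.0 - f for f in insertion_freqs]
    total = 0.0
    for allele_freqs in (insertion_freqs, deletion_freqs):
        mean = sum(allele_freqs) / k
        if mean > 0:
            total -= mean * math.log(mean)
        # Terms with frequency 0 contribute nothing (0 * ln 0 is taken as 0).
        total += sum(f * math.log(f) for f in allele_freqs if f > 0) / k
    return total


# Hypothetical insertion-allele frequencies for one locus in two populations.
pairwise_in = informativeness([0.85, 0.30])
is_candidate = pairwise_in > 0.1  # cutoff reported in the abstract
```

Under this definition, a locus with identical allele frequencies in both populations scores 0, and a locus fixed for different alleles in the two populations scores ln 2, the maximum for a pairwise comparison.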

Authors

  • Xiaoye Jin
    Department of Forensic Medicine, Guizhou Medical University, Guiyang, P. R. China.
  • Yuluo Liu
    Department of Forensic Science, Guangdong Police College, Guangzhou, P. R. China.
  • Yuanyuan Zhang
    National Clinical Research Center for Kidney Disease, State Key Laboratory for Organ Failure Research, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou 510515, Guangdong Province, China.
  • Yongle Li
    National Health Commission Key Laboratory of Birth Defects Prevention, Henan Key Laboratory of Population Defects Prevention, Zhengzhou, P. R. China.
  • Chuanliang Chen
    Henan Provincial People's Hospital, Zhengzhou University People's Hospital, Zhengzhou, P. R. China.
  • Hongdan Wang
    Medical Genetics Institute of Henan Province, Henan Provincial People's Hospital, Zhengzhou University People's Hospital, Zhengzhou, P. R. China.