A new linear combination method of haplogroup distribution central vectors to model population admixtures.

Journal: Molecular genetics and genomics : MGG
PMID:

Abstract

We introduce a novel population genetic approach for modeling the origin and relationships of populations, based on new computational methods that analyze haplogroup (Hg) frequency distributions. Hgs were assigned to groups that show correlated frequencies in subsets of populations, on the assumption that these correlations were established during ancient separation, migration and admixture processes. Populations are described by their Hg-distribution vectors in this universal Hg database; then, using unsupervised artificial intelligence, central vectors (CVs) are determined from local condensations of the Hg-distribution vectors in the multidimensional point system. Populations are clustered according to their proximity to the CVs. We show that CVs can be regarded as approximations of ancient populations, and that real populations can be modeled as weighted linear combinations of the CVs using a new linear combination algorithm based on a gradient search for the weights. The efficacy of the method is demonstrated by comparing Copper Age populations of the Carpathian Basin with medieval ones and with modern Hungarians. Our analysis reveals significant population continuity since the Middle Ages, and the presence of a substrate component since the Copper Age.
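
The idea of modeling a population as a weighted linear combination of CVs can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a least-squares objective, non-negative weights that sum to one, and a plain gradient step followed by clipping and renormalization. The function name `fit_admixture_weights`, the step size, and the toy frequency values are all hypothetical.

```python
import numpy as np

def fit_admixture_weights(cvs, target, steps=5000, lr=0.1):
    """Gradient search for mixture weights (illustrative sketch only).

    cvs    : (k, d) array, one hypothetical central vector per row
    target : (d,) array, Hg-frequency vector of the population to model
    Returns a (k,) array of non-negative weights summing to 1.
    """
    k = cvs.shape[0]
    w = np.full(k, 1.0 / k)                    # start from equal weights
    for _ in range(steps):
        residual = w @ cvs - target            # model minus observation
        grad = 2.0 * cvs @ residual            # gradient of squared error w.r.t. w
        w = np.clip(w - lr * grad, 0.0, None)  # gradient step, keep weights >= 0
        total = w.sum()
        # Renormalize so the weights sum to 1 (assumed constraint)
        w = w / total if total > 0 else np.full(k, 1.0 / k)
    return w

# Toy data (invented for illustration): three CVs over three Hg groups
cvs = np.array([[0.7, 0.2, 0.1],
                [0.1, 0.6, 0.3],
                [0.2, 0.2, 0.6]])
true_w = np.array([0.5, 0.3, 0.2])
target = true_w @ cvs                          # synthetic admixed population

w = fit_admixture_weights(cvs, target)
print(np.round(w, 3))
```

In this toy case the target is an exact mixture of the CVs, so the search should recover the generating weights. With real data the residual will generally be non-zero, and the choice of objective and constraints would need to follow the published method.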

Authors

  • Tibor Török
    Department of Genetics, University of Szeged, Szeged, Hungary.
  • Kitti Maár
    Department of Genetics, University of Szeged, Szeged, Hungary.
  • Istvan Gergely Varga
    Institute of Genetics, Biological Research Center (BRC), Szeged, Hungary.
  • Zoltán Juhász
    Institute of Technical Physics and Materials Science, Centre for Energy Research, Budapest, Hungary. juhasz@mfa.kfki.hu.