Jin, L., Seielstad, M. & Xiao, C. Genetic, Linguistic and Archaeological Perspectives on Human Diversity in Southeast Asia (World Scientific, 2001).
Glover, I. & Bellwood, P. S. Southeast Asia: From Prehistory to History (RoutledgeCurzon, 2004).
Browning, S. R., Browning, B. L., Zhou, Y., Tucci, S. & Akey, J. M. Analysis of human sequence data reveals two pulses of archaic Denisovan admixture. Cell 173, 53–61.e9 (2018).
Su, B. et al. Y-chromosome evidence for a northward migration of modern humans into Eastern Asia during the last Ice Age. Am. J. Hum. Genet. 65, 1718–1724 (1999).
Hallast, P., Agdzhoyan, A., Balanovsky, O., Xue, Y. & Tyler-Smith, C. A Southeast Asian origin for present-day non-African human Y chromosomes. Hum. Genet. 140, 299–307 (2021).
Kutanan, W. et al. Reconstructing the human genetic history of mainland Southeast Asia: insights from genome-wide data from Thailand and Laos. Mol. Biol. Evol. 38, 3459–3477 (2021).
Duong, N. T. et al. Complete human mtDNA genome sequences from Vietnam and the phylogeography of mainland Southeast Asia. Sci. Rep. 8, 11651 (2018).
Li, Y. C. et al. Ancient inland human dispersals from Myanmar into interior East Asia since the Late Pleistocene. Sci. Rep. 5, 9473 (2015).
Zhang, X. et al. Analysis of mitochondrial genome diversity identifies new and ancient maternal lineages in Cambodian aborigines. Nat. Commun. 4, 2599 (2013).
Abdulla, M. A. et al. Mapping human genetic diversity in Asia. Science 326, 1541–1545 (2009).
Peng, M. S. et al. Tracing the Austronesian footprint in mainland Southeast Asia: a perspective from mitochondrial DNA. Mol. Biol. Evol. 27, 2417–2430 (2010).
Deng, L. et al. Genetic connections and convergent evolution of tropical Indigenous peoples in Asia. Mol. Biol. Evol. 39, msab361 (2022).
Tucci, S. et al. Evolutionary history and adaptation of a human pygmy population of Flores Island, Indonesia. Science 361, 511–516 (2018).
Zhang, X. et al. The distinct morphological phenotypes of Southeast Asian aborigines are shaped by novel mechanisms for adaptation to tropical rainforests. Natl Sci. Rev. 9, nwab072 (2022).
Dhir, R. K., Cattaneo, U., Ormaza, M. V. C., Coronado, H. & Oelz, M. Implementing the ILO Indigenous and Tribal Peoples Convention No. 169: Towards an Inclusive, Sustainable and Just Future (International Labour Organization, 2020).
Wong, L. P. et al. Deep whole-genome sequencing of 100 southeast Asian Malays. Am. J. Hum. Genet. 92, 52–66 (2013).
Byrska-Bishop, M. et al. High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios. Cell 185, 3426–3440.e19 (2022).
Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).
GenomeAsia100K Consortium. The GenomeAsia 100K Project enables genetic discoveries across Asia. Nature 576, 106–111 (2019).
Wu, D. et al. Large-scale whole-genome sequencing of three diverse Asian populations in Singapore. Cell 179, 736–749.e15 (2019).
McLaren, W. et al. The Ensembl variant effect predictor. Genome Biol. 17, 122 (2016).
Logsdon, G. A. et al. Complex genetic variation in nearly complete human genomes. Preprint at bioRxiv https://doi.org/10.1101/2024.09.24.614721 (2024).
Liao, W. W. et al. A draft human pangenome reference. Nature 617, 312–324 (2023).
Wang, C., Zöllner, S. & Rosenberg, N. A. A quantitative comparison of the similarity between genes and geography in worldwide human populations. PLoS Genet. 8, e1002886 (2012).
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
Lipson, M. et al. Ancient genomes document multiple waves of migration in Southeast Asian prehistory. Science 361, 92–95 (2018).
McColl, H. et al. The prehistoric peopling of Southeast Asia. Science 361, 88–92 (2018).
Liu, D. et al. Extensive ethnolinguistic diversity in Vietnam reflects multiple sources of genetic diversity. Mol. Biol. Evol. 37, 2503–2519 (2020).
Lawson, D. J., van Dorp, L. & Falush, D. A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots. Nat. Commun. 9, 3258 (2018).
Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).
Schiffels, S. & Wang, K. MSMC and MSMC2: the multiple sequentially Markovian coalescent. Methods Mol. Biol. 2090, 147–166 (2020).
Terhorst, J., Kamm, J. A. & Song, Y. S. Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat. Genet. 49, 303–309 (2017).
Grossman, S. R. et al. A composite of multiple signals distinguishes causal variants in regions of positive selection. Science 327, 883–886 (2010).
Luo, H. et al. Recent positive selection signatures reveal phenotypic evolution in the Han Chinese population. Sci. Bull. 68, 2391–2404 (2023).
Zheng, W. et al. Large-scale genome sequencing redefines the genetic footprints of high-altitude adaptation in Tibetans. Genome Biol. 24, 73 (2023).
Liu, X. et al. Decoding triancestral origins, archaic introgression, and natural selection in the Japanese population by whole-genome sequencing. Sci. Adv. 10, eadi8419 (2024).
Lo, Y. H. et al. Detecting genetic ancestry and adaptation in the Taiwanese Han people. Mol. Biol. Evol. 38, 4149–4165 (2021).
Chen, L., Wolf, A. B., Fu, W., Li, L. & Akey, J. M. Identifying and interpreting apparent Neanderthal ancestry in African individuals. Cell 180, 677–687.e16 (2020).
Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012).
Springelkamp, H. et al. ARHGEF12 influences the risk of glaucoma by increasing intraocular pressure. Hum. Mol. Genet. 24, 2689–2699 (2015).
Kichaev, G. et al. Leveraging polygenic functional enrichment to improve GWAS power. Am. J. Hum. Genet. 104, 65–75 (2019).
Choudhury, A. et al. High-depth African genomes inform human migration and health. Nature 586, 741–748 (2020).
Efremov, G. D. et al. Hb Icaria–Hb H disease: identification of the Hb Icaria mutation through analysis of amplified DNA. Br. J. Haematol. 75, 250–253 (1990).
Vlok, M. et al. Forager and farmer evolutionary adaptations to malaria evidenced by 7000 years of thalassemia in Southeast Asia. Sci. Rep. 11, 5677 (2021).
Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).
Larena, M. et al. Multiple migrations to the Philippines during the last 50,000 years. Proc. Natl Acad. Sci. USA 118, e2026132118 (2021).
Karmin, M. et al. Episodes of diversification and isolation in island Southeast Asian and near Oceanian male lineages. Mol. Biol. Evol. 39, msac045 (2022).
Fan, S., Hansen, M. E., Lo, Y. & Tishkoff, S. A. Going global by adapting local: a review of recent human adaptation. Science 354, 54–59 (2016).
Sankararaman, S., Mallick, S., Patterson, N. & Reich, D. The combined landscape of Denisovan and Neanderthal ancestry in present-day humans. Curr. Biol. 26, 1241–1247 (2016).
Barnes, R. H., Gray, A. & Kingsbury, B. Indigenous Peoples of Asia (Association for Asian Studies, 1995).
Taylor, P. M. in World Bank Inspection Panel. Investigation Report (March 30, 2006): Cambodia: Forest Concession Management and Control Pilot Project 128–141 (World Bank, 2006).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010).
Van der Auwera, G. A. & O’Connor, B. D. Genomics in the Cloud: Using Docker, GATK, and WDL in Terra (O’Reilly Media, 2020).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Cingolani, P. et al. Using Drosophila melanogaster as a model for genotoxic chemical mutational studies with a new program, SnpSift. Front. Genet. 3, 35 (2012).
Bergström, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367, eaay5012 (2020).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
Mondal, M. et al. Genomic analysis of Andamanese provides insights into ancient human migration into Asia and adaptation. Nat. Genet. 48, 1066–1070 (2016).
Lu, D. et al. Ancestral origins and genetic history of Tibetan highlanders. Am. J. Hum. Genet. 99, 580–594 (2016).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
Wang, C. et al. Comparing spatial maps of human population-genetic variation using Procrustes analysis. Stat. Appl. Genet. Mol. Biol. 9, 13 (2010).
Pickrell, J. K. & Pritchard, J. K. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8, e1002967 (2012).
Bhatia, G., Patterson, N., Sankararaman, S. & Price, A. L. Estimating and interpreting FST: the impact of rare variants. Genome Res. 23, 1514–1521 (2013).
Tamura, K., Stecher, G. & Kumar, S. MEGA11: Molecular Evolutionary Genetics Analysis version 11. Mol. Biol. Evol. 38, 3022–3027 (2021).
Yu, G. Using ggtree to visualize data on tree-like structures. Curr. Protoc. Bioinformatics 69, e96 (2020).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
Scrucca, L., Fop, M., Murphy, T. B. & Raftery, A. E. mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. R J. 8, 289–317 (2016).
Zhang, C., Dong, S. S., Xu, J. Y., He, W. M. & Yang, T. L. PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format files. Bioinformatics 35, 1786–1788 (2019).
Wang, J., Raskin, L., Samuels, D. C., Shyr, Y. & Guo, Y. Genome measures used for quality control are dependent on gene function and ancestry. Bioinformatics 31, 318–323 (2015).
Weissensteiner, H. et al. HaploGrep 2: mitochondrial haplogroup classification in the era of high-throughput sequencing. Nucleic Acids Res. 44, W58–W63 (2016).
van Oven, M. & Kayser, M. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Hum. Mutat. 30, E386–E394 (2009).
Chen, H., Lu, Y., Lu, D. & Xu, S. Y-LineageTracker: a high-throughput analysis framework for Y-chromosomal next-generation sequencing data. BMC Bioinformatics 22, 114 (2021).
Y Chromosome Consortium. A nomenclature system for the tree of human Y-chromosomal binary haplogroups. Genome Res. 12, 339–348 (2002).
Schiffels, S. & Durbin, R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925 (2014).
Sellinger, T. P. P., Abu-Awad, D. & Tellier, A. Limits and convergence properties of the sequentially Markovian coalescent. Mol. Ecol. Resour. 21, 2231–2248 (2021).
Patton, A. H. et al. Contemporary demographic reconstruction methods are robust to genome assembly quality: a case study in Tasmanian devils. Mol. Biol. Evol. 36, 2906–2921 (2019).
Hu, W. et al. Genomic inference of a severe human bottleneck during the Early to Middle Pleistocene transition. Science 381, 979–984 (2023).
Yi, X. et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329, 75–78 (2010).
Sabeti, P. C. et al. Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–918 (2007).
Voight, B. F., Kudaravalli, S., Wen, X. & Pritchard, J. K. A map of recent positive selection in the human genome. PLoS Biol. 4, e72 (2006).
Szpiech, Z. A. & Hernandez, R. D. selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. Mol. Biol. Evol. 31, 2824–2827 (2014).
Zerbino, D. R., Wilder, S. P., Johnson, N., Juettemann, T. & Flicek, P. R. The Ensembl regulatory build. Genome Biol. 16, 56 (2015).
The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Romanoski, C. E., Glass, C. K., Stunnenberg, H. G., Wilson, L. & Almouzni, G. Epigenomics: roadmap for regulation. Nature 518, 314–316 (2015).
Adams, D. et al. BLUEPRINT to decode the epigenetic signature written in blood. Nat. Biotechnol. 30, 224–226 (2012).
Turchin, M. C. et al. Evidence of widespread selection on standing variation in Europe at height-associated SNPs. Nat. Genet. 44, 1015–1019 (2012).
Racimo, F., Berg, J. J. & Pickrell, J. K. Detecting polygenic adaptation in admixture graphs. Genetics 208, 1565–1584 (2018).
Berg, J. J. & Coop, G. A population genetic signal of polygenic adaptation. PLoS Genet. 10, e1004412 (2014).
Yengo, L. et al. A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022).
Chen, M. et al. Evidence of polygenic adaptation in Sardinia at height-associated loci ascertained from the Biobank Japan. Am. J. Hum. Genet. 107, 60–71 (2020).
Barrett, J. C., Fry, B., Maller, J. & Daly, M. J. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265 (2005).
Hofmeister, R. J., Ribeiro, D. M., Rubinacci, S. & Delaneau, O. Accurate rare variant phasing of whole-genome and whole-exome sequencing data in the UK Biobank. Nat. Genet. 55, 1243–1249 (2023).
Rozas, J. et al. DnaSP 6: DNA sequence polymorphism analysis of large data sets. Mol. Biol. Evol. 34, 3299–3302 (2017).
Excoffier, L. & Lischer, H. E. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol. Ecol. Resour. 10, 564–567 (2010).
Leigh, J. W. & Bryant, D. POPART: full-feature software for haplotype network construction. Methods Ecol. Evol. 6, 1110–1116 (2015).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072–1075 (2013).
Hu, J. et al. NextPolish2: a repeat-aware polishing tool for genomes assembled using HiFi long reads. Genomics Proteomics Bioinformatics 22, qzad009 (2024).
Chen, Y., Zhang, Y., Wang, A. Y., Gao, M. & Chong, Z. Accurate long-read de novo assembly evaluation with Inspector. Genome Biol. 22, 312 (2021).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Smolka, M. et al. Detection of mosaic and population-level structural variants with Sniffles2. Nat. Biotechnol. 42, 1571–1580 (2024).
Ebert, P. et al. Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science 372, eabf7117 (2021).
Wang, S. et al. De novo and somatic structural variant discovery with SVision-pro. Nat. Biotechnol. 43, 181–185 (2025).
Kirsche, M. et al. Jasmine and Iris: population-scale structural variant comparison and analysis. Nat. Methods 20, 408–417 (2023).
Chen, S. et al. Paragraph: a graph-based structural variant genotyper for short-read sequence data. Genome Biol. 20, 291 (2019).
Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014).
Prüfer, K. et al. A high-coverage Neandertal genome from Vindija Cave in Croatia. Science 358, 655–658 (2017).
Delaneau, O., Zagury, J. F., Robinson, M. R., Marchini, J. L. & Dermitzakis, E. T. Accurate, scalable and integrative haplotype estimation. Nat. Commun. 10, 5436 (2019).
Loh, P. R. et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 48, 1443–1448 (2016).
Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016).
Huang, J. et al. Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel. Nat. Commun. 6, 8111 (2015).