Lax, G. et al. Hemimastigophora is a novel supra-kingdom-level lineage of eukaryotes. Nature 564, 410–414 (2018).
Brown, M. W. et al. Phylogenomics places orphan protistan lineages in a novel eukaryotic super-group. Genome Biol. Evol. 10, 427–433 (2018).
Tikhonenkov, D. V. et al. Microbial predators form a new supergroup of eukaryotes. Nature 612, 714–719 (2022).
Janouškovec, J. et al. A new lineage of eukaryotes illuminates early mitochondrial genome reduction. Curr. Biol. 27, 3717–3724 (2017).
Gawryluk, R. M. R. et al. Non-photosynthetic predators are sister to red algae. Nature 572, 240–243 (2019).
Schön, M. E. et al. Single cell genomics reveals plastid-lacking Picozoa are close relatives of red algae. Nat. Commun. 12, 6651 (2021).
Gray, M. W. et al. The draft nuclear genome sequence and predicted mitochondrial proteome of Andalucia godoyi, a protist with the most gene-rich and bacteria-like mitochondrial genome. BMC Biol. 18, 22 (2020).
Horváthová, L. et al. Analysis of diverse eukaryotes suggests the existence of an ancestral mitochondrial apparatus derived from the bacterial type II secretion system. Nat. Commun. 12, 2947 (2021).
Burger, G., Gray, M. W., Forget, L. & Lang, B. F. Strikingly bacteria-like and gene-rich mitochondrial genomes throughout jakobid protists. Genome Biol. Evol. 5, 418–438 (2013).
Moreira, D., Blaz, J., Kim, E. & Eme, L. A gene-rich mitochondrion with a unique ancestral protein transport system. Curr. Biol. 34, 3812–3819 (2024).
Burki, F., Roger, A. J., Brown, M. W. & Simpson, A. G. B. The new tree of eukaryotes. Trends Ecol. Evol. 35, 43–55 (2020).
Lukeš, J., Čepička, I. & Kolísko, M. Evolution: no end in sight for novel incredible (heterotrophic) protists. Curr. Biol. 34, R55–R58 (2024).
Sunagawa, S. et al. Tara Oceans: towards global ocean ecosystems biology. Nat. Rev. Microbiol. 18, 428–445 (2020).
del Campo, J. et al. The protist cultural renaissance. Trends Microbiol. 32, 128–131 (2024).
Timmis, J. N., Ayliffe, M. A., Huang, C. Y. & Martin, W. Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes. Nat. Rev. Genet. 5, 123–135 (2004).
Gabaldón, T. & Huynen, M. A. Reconstruction of the proto-mitochondrial metabolism. Science 301, 609 (2003).
Gawryluk, R. M. R. & Stairs, C. W. Diversity of electron transport chains in anaerobic protists. Biochim. Biophys. Acta Bioenerg. 1862, 148334 (2021).
Namasivayam, S. et al. Massive invasion of organellar DNA drives nuclear genome evolution in Toxoplasma. Proc. Natl Acad. Sci. USA 120, e2308569120 (2023).
He, D., Fu, C.-J. & Baldauf, S. L. Multiple origins of eukaryotic cox15 suggest horizontal gene transfer from bacteria to jakobid mitochondrial DNA. Mol. Biol. Evol. 33, 122–133 (2016).
Milner, D. S., Wideman, J. G., Stairs, C. W., Dunn, C. D. & Richards, T. A. A functional bacteria-derived restriction modification system in the mitochondrion of a heterotrophic protist. PLoS Biol. 19, e3001126 (2021).
Gray, M. W. Mosaic nature of the mitochondrial proteome: Implications for the origin and evolution of mitochondria. Proc. Natl Acad. Sci. USA 112, 10133–10138 (2015).
Pyrih, J. et al. Vestiges of the bacterial signal recognition particle-based protein targeting in mitochondria. Mol. Biol. Evol. 38, 3170–3187 (2021).
Eglit, Y. et al. Meteora sporadica, a protist with incredible cell architecture, is related to Hemimastigophora. Curr. Biol. 34, 451–459 (2024).
Shiryev, S. A. & Agarwala, R. Indexing and searching petabase-scale nucleotide resources. Nat. Methods 21, 994–1002 (2024).
Lynch, M. D. J. & Neufeld, J. D. Ecology and exploration of the rare biosphere. Nat. Rev. Microbiol. 13, 217–229 (2015).
Forster, D. et al. Benthic protists: the under-charted majority. FEMS Microbiol. Ecol. 92, fiw120 (2016).
Hausmann, K. Extrusive organelles in protists. Int. Rev. Cytol. 52, 197–276 (1978).
Tice, A. K. et al. PhyloFisher: a phylogenomic package for resolving eukaryotic relationships. PLoS Biol. 19, e3001365 (2021).
Banos, H. et al. GTRpmix: a linked general time-reversible model for profile mixture models. Mol. Biol. Evol. 41, msae174 (2024).
Si Quang, L., Gascuel, O. & Lartillot, N. Empirical profile mixture models for phylogenetic reconstruction. Bioinformatics 24, 2317–2323 (2008).
Shimodaira, H. An approximately unbiased test of phylogenetic tree selection. Syst. Biol. 51, 492–508 (2002).
Torruella, G., Galindo, L. J., Moreira, D. & López-García, P. Phylogenomics of neglected flagellated protists supports a revised eukaryotic tree of life. Curr. Biol. 35, 198–207 (2025).
Lartillot, N. & Philippe, H. Improvement of molecular phylogenetic inference and the phylogeny of Bilateria. Phil. Trans. R. Soc. B 363, 1463–1472 (2008).
Cranford-Smith, T. & Huber, D. The way is the goal: how SecA transports proteins across the cytoplasmic membrane in bacteria. FEMS Microbiol. Lett. 365, fny093 (2018).
Petrů, M., Dohnálek, V., Füssy, Z. & Doležal, P. Fates of Sec, Tat, and YidC translocases in mitochondria and other eukaryotic compartments. Mol. Biol. Evol. 38, 5241–5254 (2021).
Smets, D., Loos, M. S., Karamanou, S. & Economou, A. Protein transport across the bacterial plasma membrane by the Sec pathway. Protein J. 38, 262–273 (2019).
Hsieh, Y. et al. SecA alone can promote protein translocation and ion channel activity. J. Biol. Chem. 286, 44702–44709 (2011).
Hsieh, Y. et al. Dissecting structures and functions of SecA-only protein-conducting channels: ATPase, pore structure, ion channel activity, protein translocation, and interaction with SecYEG/SecDF•YajC. PLoS One 12, e0178307 (2017).
Köstlbacher, S., Panagiotou, K., Tamarit, D. & Ettema, T. J. G. WitChi: Efficient detection and pruning of compositional bias in phylogenomic alignments using empirical chi-squared testing. Preprint at bioRxiv https://doi.org/10.1101/2025.07.14.663642 (2025).
Tong, J. et al. Ancestral and derived protein import pathways in the mitochondrion of Reclinomonas americana. Mol. Biol. Evol. 28, 1581–1591 (2011).
Dembech, E. et al. Identification of hidden associations among eukaryotic genes through statistical analysis of coevolutionary transitions. Proc. Natl Acad. Sci. USA 120, e2218329120 (2023).
Alto, L. T. & Terman, J. R. in Semaphorin Signaling: Methods in Molecular Biology Vol. 1493 (ed. Terman, J. R.) 1–25 (Springer, 2017).
Hochstrasser, M. Origin and function of ubiquitin-like proteins. Nature 458, 422–429 (2009).
Pereira, R. V. et al. Ubiquitin-specific proteases are differentially expressed throughout the Schistosoma mansoni life cycle. Parasit. Vectors 8, 349 (2015).
Burge, R. J., Damianou, A., Wilkinson, A. J., Rodenko, B. & Mottram, J. C. Leishmania differentiation requires ubiquitin conjugation mediated by a UBC2–UEV1 E2 complex. PLoS Pathog. 16, e1008784 (2020).
Rizos, I., Frada, M. J., Bittner, L. & Not, F. Life cycle strategies in free-living unicellular eukaryotes: diversity, evolution, and current molecular tools to unravel the private life of microorganisms. J. Eukaryot. Microbiol. 71, e13052 (2024).
Hofstatter, P. G., Brown, M. W. & Lahr, D. J. G. Comparative genomics supports sex and meiosis in diverse Amoebozoa. Genome Biol. Evol. 10, 3118–3128 (2018).
Sibbald, S. J. & Archibald, J. M. More protist genomes needed. Nat. Ecol. Evol. 1, 0145 (2017).
Valt, M. & Hrubá, P. Chemical fixation of Solarion arienae for transmission electron microscopy. protocols.io https://doi.org/10.17504/protocols.io.kxygxyd5zl8j/v2 (2024).
Valt, M. HPF-FS of Solarion arienae for transmission electron microscopy. protocols.io https://doi.org/10.17504/protocols.io.dm6gpzp15lzp/v2 (2024).
Mastronarde, D. N. Automated electron microscope tomography using robust prediction of specimen movements. J. Struct. Biol. 152, 36–51 (2005).
Mastronarde, D. N. & Held, S. R. Automated tilt series alignment and tomographic reconstruction in IMOD. J. Struct. Biol. 197, 102–113 (2017).
Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
Bodian, D. A new method for staining nerve fibers and nerve endings in mounted paraffin sections. Anat. Rec. 65, 89–97 (1936).
Nie, D. Morphology and taxonomy of the intestinal protozoa of the guinea-pig, Cavia porcella. J. Morphol. 86, 381–493 (1950).
Valt, M. & Kotyk, M. Permanent specimen preparation by protargol staining. protocols.io https://doi.org/10.17504/protocols.io.q26g71or9gwz/v1 (2024).
Medlin, L., Elwood, H. J., Stickel, S. & Sogin, M. L. The characterization of enzymatically amplified eukaryotic 16S-like rRNA-coding regions. Gene 71, 491–499 (1988).
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
Bushmanova, E., Antipov, D., Lapidus, A. & Prjibelski, A. D. rnaSPAdes: a de novo transcriptome assembler and its application to RNA-seq data. GigaScience 8, giz100 (2019).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Criscuolo, A. & Gribaldo, S. BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol. Biol. 10, 210 (2010).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS One 5, e9490 (2010).
Guillou, L. et al. The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote small sub-unit rRNA sequences with curated taxonomy. Nucleic Acids Res. 41, D597–D604 (2012).
Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., Von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
Rio, D. C., Ares, M., Hannon, G. J. & Nilsen, T. W. Purification of RNA using TRIzol (TRI Reagent). Cold Spring Harb. Protoc. https://doi.org/10.1101/pdb.prot5439 (2010).
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Lafond-Lapalme, J., Duceppe, M.-O., Wang, S., Moffett, P. & Mimee, B. A new method for decontamination of de novo transcriptomes using a hierarchical clustering algorithm. Bioinformatics 33, 1293–1300 (2017).
Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
Huerta-Cepas, J. et al. EggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 47, D309–D314 (2019).
Pánek, T. et al. A new lineage of non-photosynthetic green algae with extreme organellar genomes. BMC Biol. 20, 66 (2022).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Kolmogorov, M. et al. metaFlye: scalable long-read metagenome assembly using repeat graphs. Nat. Methods 17, 1103–1110 (2020).
Hu, J., Fan, J., Sun, Z. & Liu, S. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36, 2253–2255 (2020).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–477 (2012).
Wu, Y.-W., Simmons, B. A. & Singer, S. W. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics 32, 605–607 (2016).
Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).
Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072–1075 (2013).
Manni, M., Berkeley, M. R., Seppey, M. & Zdobnov, E. M. BUSCO: assessing genomic data quality and beyond. Curr. Protoc. 1, e323 (2021).
Challis, R., Richards, E., Rajan, J., Cochrane, G. & Blaxter, M. BlobToolKit – interactive quality assessment of genome assemblies. G3 10, 1361–1374 (2020).
Gabriel, L. et al. BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA. Genome Res. 34, 769–777 (2024).
Smit, A. F. A., Hubley, R. & Green, P. RepeatMasker Open-4.0 http://www.repeatmasker.org (2013–2015).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).
Tegenfeldt, F. et al. OrthoDB and BUSCO update: annotation of orthologs with wider sampling of genomes. Nucleic Acids Res. 53, D516–D522 (2025).
Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Huang, N. & Li, H. compleasm: a faster and more accurate reimplementation of BUSCO. Bioinformatics 39, btad595 (2023).
Gabriel, L., Hoff, K. J., Brůna, T., Borodovsky, M. & Stanke, M. TSEBRA: transcript selector for BRAKER. BMC Bioinformatics 22, 566 (2021).
Brůna, T., Gabriel, L. & Hoff, K. J. in Insect Genomics: Methods in Molecular Biology Vol. 2935 (eds Bonizzoni, M. & Ometto, L.) 67–107 (Springer, 2025).
Jones, R. E. et al. Create, analyze, and visualize phylogenomic datasets using PhyloFisher. Curr. Protoc. 4, e969 (2024).
Wang, H.-C., Minh, B. Q., Susko, E. & Roger, A. J. Modeling site heterogeneity with posterior mean site frequency profiles accelerates accurate phylogenomic estimation. Syst. Biol. 67, 216–235 (2018).
Anisimova, M., Gil, M., Dufayard, J.-F., Dessimoz, C. & Gascuel, O. Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. Syst. Biol. 60, 685–699 (2011).
Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
Kishino, H., Miyata, T. & Hasegawa, M. Maximum likelihood inference of protein phylogeny and the origin of chloroplasts. J. Mol. Evol. 31, 151–160 (1990).
Susko, E., Field, C., Blouin, C. & Roger, A. J. Estimation of rates-across-sites distributions in phylogenetic substitution models. Syst. Biol. 52, 594–603 (2003).
Comte, A. et al. PhylteR: efficient identification of outlier sequences in phylogenomic datasets. Mol. Biol. Evol. 40, msad234 (2023).
Shen, X.-X., Hittinger, C. T. & Rokas, A. Contentious relationships in phylogenomic studies can be driven by a handful of genes. Nat. Ecol. Evol. 1, 0126 (2017).
Huson, D. H. et al. Dendroscope: an interactive viewer for large phylogenetic trees. BMC Bioinformatics 8, 460 (2007).
Lang, B. F. et al. Mitochondrial genome annotation with MFannot: a critical analysis of gene identification and gene model prediction. Front. Plant Sci. 14, 1222186 (2023).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).
Zimmermann, L. et al. A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Mol. Biol. 430, 2237–2243 (2018).
Chan, P. P. & Lowe, T. M. in Gene Prediction: Methods in Molecular Biology Vol. 1962 (ed. Kollmar, M.) 1–14 (Springer, 2019).
Teufel, F. et al. SignalP 6.0 predicts all five types of signal peptides using protein language models. Nat. Biotechnol. 40, 1023–1025 (2022).
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Pettersen, E. F. et al. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
Steinegger, M. & Söding, J. Clustering huge protein sequence sets in linear time. Nat. Commun. 9, 2542 (2018).
Richter, D. J. et al. EukProt: a database of genome-scale predicted proteins across the diversity of eukaryotes. Peer Community J. 2, e56 (2022).
Söding, J., Biegert, A. & Lupas, A. N. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 33, W244–W248 (2005).
Krogh, A., Larsson, B., Von Heijne, G. & Sonnhammer, E. L. L. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567–580 (2001).
Liu, Y., Schmidt, B. & Maskell, D. L. MSAProbs: multiple sequence alignment based on pair hidden Markov models and partition function posterior probabilities. Bioinformatics 26, 1958–1964 (2010).
Nguyen, L. T., Schmidt, H. A., Von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
Hoang, D. T., Chernomor, O., Von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).
Lartillot, N., Rodrigue, N., Stubbs, D. & Richer, J. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, 611–615 (2013).
Sun, J. et al. OrthoVenn3: an integrated platform for exploring and visualizing orthologous data across genomes. Nucleic Acids Res. 51, W397–W403 (2023).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Tang, S., Lomsadze, A. & Borodovsky, M. Identification of protein coding regions in RNA transcripts. Nucleic Acids Res. 43, e78 (2015).
Blum, M. et al. InterPro: the protein sequence classification resource in 2025. Nucleic Acids Res. 53, D444–D456 (2025).
Valt, M. et al. Molecular and supplementary data of Solarion arienae. Figshare https://doi.org/10.6084/m9.figshare.27182820 (2025).
Bertrand, D. et al. Hybrid metagenomic assembly enables high-resolution analysis of resistance determinants and mobile elements in human microbiomes. Nat. Biotechnol. 37, 937–944 (2019).
Zimin, A. V. et al. The MaSuRCA genome assembler. Bioinformatics 29, 2669–2677 (2013).
Di Genova, A., Buena-Atienza, E., Ossowski, S. & Sagot, M. F. Efficient hybrid de novo assembly of human genomes with WENGAN. Nat. Biotechnol. 39, 422–430 (2021).
Field, H. I., Coulson, R. M. & Field, M. C. An automated graphics tool for comparative genomics: the Coulson plot generator. BMC Bioinformatics 14, 141 (2013).

