Ding He, PhD Student
General Idea
Eukaryotes are complex-celled organisms with many internal structures. One of the most important of these is the mitochondrion, which most eukaryotes depend on for generating most of their energy. There is now abundant evidence that mitochondria are derived from once free-living bacteria by a process called endosymbiosis. Part of this process involved the transfer of many bacterial genes from the young symbiont to the host nucleus, bringing the genes under nuclear regulation (although they still encode proteins that are post-translationally targeted to the organelle). This suggests that there should be a universal set of nuclear-encoded proteins of bacterial origin common to all mitochondriate eukaryotes. However, it has been very difficult to identify these genes using conventional genetic methods because mutations in them tend to produce lethal phenotypes. As a result, the nuclear-encoded mitochondrial (NcMt) proteome is one of the most poorly understood aspects of eukaryotic cells. The goal of my project is to define the core components of the NcMt proteome by identifying the universal set of nuclear-encoded mitochondrial proteins. These data will then be used to tackle another important outstanding question in eukaryote evolution: the position of the root of the eukaryote tree of life.
My research is aimed at identifying the universal NcMt genes of eukaryotes using a bioinformatics approach. I am using eukaryotes with completely sequenced and well-annotated genomes, which are now available for a wide variety of species spanning the eukaryote tree. The universal NcMt genes can then be used to reconstruct the deep phylogeny of eukaryotes and to investigate the molecular evolution of mitochondria. The project is therefore divided into the following steps:
Part 1. Candidate gene identification.
Based on the theory that mitochondria evolved gradually from bacteria, I am taking an indirect approach: identifying and classifying the bacterial component of eukaryotic genomes, that is, the protein-coding genes in eukaryotic nuclear genomes that are bacterial in origin. Eukaryotic nuclear-encoded bacterial proteins (euBacs) will be identified from a set of completely sequenced and well-annotated eukaryotic proteomes. All euBacs will then be assessed for how universally they are distributed across the eukaryotic tree.
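The universality check described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual pipeline: it assumes that euBacs have already been clustered into families and that each species has been assigned to a major eukaryotic group. The function name `universal_families`, the family labels `famA` and `famB`, and the `min_fraction` threshold are all hypothetical.

```python
from typing import Dict, List, Set


def universal_families(
    family_to_species: Dict[str, Set[str]],
    species_to_group: Dict[str, str],
    min_fraction: float = 1.0,
) -> List[str]:
    """Return families found in at least `min_fraction` of the major groups.

    Both input mappings are assumed to come from earlier steps
    (homology search, clustering, taxonomy assignment).
    """
    all_groups = set(species_to_group.values())
    if not all_groups:
        raise ValueError("species_to_group must not be empty")

    universal = []
    for family, species in sorted(family_to_species.items()):
        # Collapse species-level hits to the major groups they represent.
        groups_hit = {species_to_group[s] for s in species if s in species_to_group}
        if len(groups_hit) / len(all_groups) >= min_fraction:
            universal.append(family)
    return universal


# Toy input: two hypothetical euBac families across three major groups.
species_to_group = {
    "Homo sapiens": "Opisthokonta",
    "Saccharomyces cerevisiae": "Opisthokonta",
    "Arabidopsis thaliana": "Archaeplastida",
    "Dictyostelium discoideum": "Amoebozoa",
}
family_to_species = {
    "famA": {"Homo sapiens", "Arabidopsis thaliana", "Dictyostelium discoideum"},
    "famB": {"Homo sapiens", "Saccharomyces cerevisiae"},
}

print(universal_families(family_to_species, species_to_group))  # ['famA']
```

Counting major groups rather than individual species keeps heavily sampled lineages from dominating the result. Setting `min_fraction` below 1.0 would tolerate occasional gene loss or incomplete annotation.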
Part 2. Molecular evolution of nuclear-encoded mitochondrial genes.
The universal set of euBacs identified in the first part of the project will be examined to determine which are targeted to the mitochondrion. This will be done by comparing them with the results of proteomic studies or by using various subcellular localization prediction tools.
Part 3. Deep phylogenetic relationships of major eukaryotic groups.
All universal euBacs will be examined by detailed phylogenetic analysis to determine whether all proteins have a congruent phylogenetic signal. For all phylogenetically congruent protein-coding genes, the data sets will be expanded to include all available eukaryotic sequences and a set of outgroup bacteria. The sequences will then be concatenated and analyzed to test the deep phylogeny of eukaryotes using a variety of phylogenetic methods and evolutionary models, plus various appropriate tests of phylogenetic robustness.
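The concatenation step mentioned above can be illustrated with a short Python sketch. It assumes that each gene has already been aligned separately and is held as a mapping from taxon name to aligned sequence. The function name `concatenate_alignments` and the taxon labels are hypothetical. Filling missing taxa with gap characters is one common convention, not necessarily the one used in this project.

```python
from typing import Dict, List


def concatenate_alignments(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene alignments into one supermatrix keyed by taxon.

    A taxon missing from a gene is padded with '-' for that gene's length,
    so every row of the result has the same total length.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}

    for index, aln in enumerate(alignments):
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            # Empty alignments and unequal row lengths are both rejected.
            raise ValueError(f"alignment {index} is empty or not rectangular")
        length = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * length))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy input: two short hypothetical protein alignments.
gene1 = {"taxonA": "MK-L", "taxonB": "MKAL"}
gene2 = {"taxonA": "GG", "taxonC": "GA"}

supermatrix = concatenate_alignments([gene1, gene2])
for taxon, row in supermatrix.items():
    print(taxon, row)
```

The resulting supermatrix would then be passed to whichever phylogenetic inference program is being used. The sketch covers only the bookkeeping of joining genes and does not model partitions or per-gene evolutionary models.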
- The Tree of Life – and how to build it 10 hp. I work as a teaching assistant.
Feel free to contact me with questions, both popular and scientific.



Ding He