PhD Defense by Congmin Xu

Congmin Xu

BME Ph.D. Thesis Defense


Date: June 14th, 2020

Time: 9:00 PM

Location: Peking University, College of Engineering, Building No 1, Room 212

Bluejeans link: https://bluejeans.com/8103233104/


Advisors: Huaiqiu Zhu, Ph.D. and Peng Qiu, Ph.D.


Committee members:

Huaiqiu Zhu, Ph.D. (advisor)

Peng Qiu, Ph.D. (co-advisor)

Chenggang Zhang, Ph.D.

Liping Duan, M.D.

Ziding Zhang, Ph.D.

Jianzhong Xi, Ph.D.

Zhifei Dai, Ph.D.


Title: Metagenomics analysis of disease-related human gut microbiota


Abstract: The human gut microbiota has been linked to various pathological disorders. Yet our understanding of the underlying mechanisms is still limited by inconsistent results across publications and by the inherent complexity of the microbiota. Separate studies and incomparable data sets miss the forest for the trees, which motivated us to carry out a meta-analysis of the human gut microbiome across different kinds of diseases and to explore what kind of human gut microbial community is healthy.

  1. This dissertation uncovers the consistent pattern behind disease-related dysbiosis by conducting a pan-microbiome analysis, which annotated and analyzed the microbiome contigs and genes identified from raw reads of whole genome sequencing (WGS) data of the human gut. A consistent pattern shift was discovered in the mutually dependent microbial community: microbial members are more competitive and less cooperative in disease than in health, driven notably by a 20-fold increase in competitive pairs between potential pathogens and a 10-fold decrease in cooperative pairs between non-pathogens. Additionally, treating all the microbiota in the same community as a ‘super organism’, our mathematical model of the gene-gene interaction network revealed the significance of cell motility, even though it was not a dominant functional category. This part of the work addresses how the ecological niches of the gut modulate human health in a systematic manner.
  2. This dissertation found that some inflammation- and cancer-related genera increase in individuals of advanced age while some beneficial genera are lost, and demonstrated the existence of an aging progression of the human gut microbiota, by applying an unsupervised machine learning algorithm to recapitulate the underlying aging progression of the microbial community from hosts in different age groups. The aging process captures many facets of biological variation in the human body and leads to functional decline and an increased incidence of infection in the gut of elderly people. Unlike disease, aging is a continuous process. We obtained raw 16S rRNA sequencing data from a previous study, covering subjects ranging from newborns to centenarians, and summarized the data into a relative abundance matrix of genera across all samples. Without using the age information of the samples, we applied multivariate unsupervised analysis, which revealed a continuous aging progression of the human gut microbiota that parallels host aging. The genera associated with this aging process are useful for designing probiotics that keep the gut microbiota resembling that of a young age, which will hopefully have a positive impact on human health, especially for individuals in advanced age groups.
  3. This dissertation develops LightCUD, a machine learning model for disease discrimination based on the human gut microbiome, designed to discriminate ulcerative colitis (UC) and Crohn’s disease (CD) from non-IBD colitis. Using WGS data from 349 human gut microbiota samples covering two types of inflammatory bowel disease (IBD) and healthy controls, we assembled and aligned WGS short reads to obtain feature profiles. Owing to careful feature selection and a comparison of machine learning algorithms, LightCUD outperforms other pilot studies. LightCUD is implemented in Python and packaged for free installation with customized databases. With WGS data or 16S rDNA sequencing data of gut microbiota samples as input, LightCUD can discriminate IBD from healthy controls with high accuracy and further identify the specific type of IBD. The executable program LightCUD is released as open source at http://cqb.pku.edu.cn/Zhulab/lightcud/.
  4. This dissertation constructed DREEM, a comprehensive database of Disease RElatEd Marker genes in the human gut microbiome, built from large-scale WGS data released in GenBank and EMBL. Short reads totaling 18.63T from 1,729 samples were processed with a unified procedure involving state-of-the-art bioinformatics tools and well-designed statistical analysis, covering six types of pathological conditions: type 2 diabetes (T2D), Crohn’s disease, ulcerative colitis, liver cirrhosis, symptomatic atherosclerosis, and obesity. Furthermore, the database annotates the disease-related marker genes functionally and taxonomically. DREEM contains 1,953,046 disease-related marker genes and 5,100 core genes. The database is accessible at http://cqb.pku.edu.cn/ZhuLab/DREEM.
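
As an illustration of the second item above, the following minimal sketch shows how genus-level read counts could be summarized into a relative abundance matrix and how samples could then be ordered along one axis without using their ages. It is not the dissertation's algorithm: the abstract does not name the unsupervised method, so a first principal component computed by power iteration stands in for it here, and the counts, function names, and sample layout are all hypothetical.

```python
# Toy sketch only: invented counts, and PCA (via power iteration) used as a
# stand-in for the unnamed "multivariate unsupervised analysis" in the abstract.

def relative_abundance(counts):
    """Turn per-sample genus read counts into proportions that sum to 1."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("sample has no reads")
        matrix.append([c / total for c in row])
    return matrix


def first_principal_scores(matrix, iterations=200):
    """Project each sample onto the leading principal axis of the data."""
    n, p = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in matrix]
    # Non-uniform start vector: a constant vector would be orthogonal to every
    # centered row, because relative abundances in each sample sum to 1.
    axis = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        scores = [sum(r[j] * axis[j] for j in range(p)) for r in centered]
        update = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = sum(x * x for x in update) ** 0.5
        if norm == 0:
            break
        axis = [x / norm for x in update]
    return [sum(r[j] * axis[j] for j in range(p)) for r in centered]


# Rows are samples, columns are three hypothetical genera (invented numbers).
counts = [
    [90, 10, 0],
    [60, 30, 10],
    [30, 50, 20],
    [10, 40, 50],
]
rel = relative_abundance(counts)
scores = first_principal_scores(rel)
# Sample indices sorted along the inferred axis; ages are never consulted.
order = sorted(range(len(scores)), key=lambda i: scores[i])
print(order)
```

The sign of a principal axis is arbitrary, so the ordering may come out in either direction; deciding which end corresponds to younger hosts would require comparing the ordering with age labels after the fact.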


In summary, this dissertation conducted a pan-microbiome analysis integrating multiple diseases, revealed the aging progression of the human gut microbiota, released the tool LightCUD for discriminating diseases based on the human gut microbiome, and constructed a database of disease-related marker genes in the human gut microbiota.

