CN113174444A - Gestational diabetes biomarker of intestinal bacteria in early pregnancy and screening and application thereof - Google Patents

Info

Publication number
CN113174444A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110474713.1A
Other languages
Chinese (zh)
Inventor
胡平
潘安
潘雄飞
谌秀仪
褚旭峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Application filed by Huazhong University of Science and Technology
Priority to CN202110474713.1A
Publication of CN113174444A
Legal status: Pending

Classifications

    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes, for detection or identification of organisms, for bacteria
    • C12Q1/06 Quantitative determination (under C12Q1/04 Determining presence or kind of microorganism; Use of selective media for testing antibiotics or bacteriocides; Compositions containing a chemical indicator therefor)
    • C12Q1/14 Streptococcus; Staphylococcus (under C12Q1/04 Determining presence or kind of microorganism; Use of selective media for testing antibiotics or bacteriocides; Compositions containing a chemical indicator therefor)
    • C12Q1/6895 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes, for detection or identification of organisms, for plants, fungi or algae
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, for calculating health indices; for individual health risk assessment
    • G01N2800/042 Disorders of carbohydrate metabolism, e.g. diabetes, glucose metabolism
    • G01N2800/50 Determining the risk of developing a disease

Abstract

The invention discloses gestational diabetes biomarkers based on intestinal bacteria in early pregnancy, together with their screening and application, and belongs to the field of research on intestinal flora and human health. The biomarkers comprise 27 intestinal bacterial biomarkers and 4 calculated indexes of intestinal bacteria. Sample DNA is subjected to 16S rDNA amplicon library construction and sequencing, and the structure and function of the intestinal flora and their association with gestational diabetes are analyzed. Regression model testing shows that the 6 screened intestinal bacterial classifications significantly improve the predictive performance of a basic prediction model for gestational diabetes. The log-transformed ratios of the relative abundance of Staphylococcus to that of 4 classes of short-chain fatty acid-producing bacteria are elevated in the case group and are positively correlated with fasting, 1 h and 2 h blood glucose levels in mid-pregnancy. The biomarkers can be used as detection targets for preparing a kit for predicting the risk of onset of gestational diabetes in early pregnancy, or as targets for preparing a medicament for treating and/or preventing gestational diabetes.

Description

Gestational diabetes biomarker of intestinal bacteria in early pregnancy and screening and application thereof
Technical Field
The invention belongs to the field of research on intestinal flora and human health. It relates to a series of early-pregnancy intestinal bacteria and bacterial calculated indexes associated with the onset of gestational diabetes, and in particular to gestational diabetes biomarkers based on intestinal bacteria in early pregnancy and their screening and application.
Background
With the rapid economic development and social transformation of China, maternal and infant health faces new problems: the health and pregnancy outcomes of pregnant and parturient women are affected by risk factors such as nutrient-rich diets, low physical activity, obesity and advanced maternal age. Among these problems, gestational diabetes mellitus (GDM) is one of the main complications of pregnancy and seriously harms the health of mother and infant. Studying the pathogenesis of GDM and identifying risk factors for its early prediction are of great importance for preventing and controlling GDM, reducing adverse pregnancy outcomes, and reducing the risk of long-term postpartum diabetes.
The human intestinal environment hosts numerous and complex bacterial species. Changes in the structure of this flora can alter the microecological metabolites of the whole intestinal tract and induce corresponding feedback effects in the host. Diet, one of the major factors shaping the host gut flora, is likely to be the starting point of GDM pathogenesis. Disturbance of the intestinal flora under certain dietary regimes can further lead to abnormal lipid metabolism, fatty liver and systemic inflammatory responses (Kau, A.L., Ahern, P.P., Griffin, N.W., Goodman, A.L. & Gordon, J.I. Human nutrition, the gut microbiome and the immune system. Nature 474, 327-336, doi:10.1038/nature10213 (2011); Moschen, A.R., Wieser, V. & Tilg, H. Dietary factors: major regulators of the gut's microbiota. Gut and Liver 6, 411-416, doi:10.5009/gnl.2012.6.4.411 (2012)). Evidence suggests that inflammatory pathways are strongly involved in insulin resistance in metabolic disorders (Johnson, A.M. & Olefsky, J.M. The origins and drivers of insulin resistance. Cell 152, 673-684, doi:10.1016/j.cell.2013.01.041 (2013)), and that the gut flora has very likely already changed before a GDM diagnosis is confirmed and is closely related to new-onset GDM.
No study is currently available that explores the pathogenic mechanism of GDM from the viewpoint of intestinal flora, in particular one that reveals characteristics of the intestinal flora that may be causally linked to disease onset in early pregnancy (T1), before the disease has developed. In a related study with data from 15 GDM patients, the overall microbial diversity of the patients' intestinal flora at T1 was lower than that of the control group, but no biomarker indicating the risk of GDM onset was found in the intestinal flora in early pregnancy (Koren, O. et al. Host remodeling of the gut microbiome and metabolic changes during pregnancy. Cell 150, 470-480, doi:10.1016/j.cell.2012.07.008 (2012)). That study was not designed with a focus on GDM, however, and its data have clear shortcomings: too small a sample size, an unsuitable choice of control group, and a lack of epidemiological and other clinical data. Therefore, with a sound epidemiological study design, and in particular with a large prospective birth cohort, researchers can carry out a series of etiological studies on GDM pathogenesis and identify biomarkers that distinguish the intestinal flora of GDM and non-GDM pregnant women at the T1 stage.
Disclosure of Invention
To overcome the deficiencies of the prior art, an object of the present invention is to provide early-pregnancy GDM biomarkers based on intestinal microorganisms and their screening, comprising 27 screened GDM-associated early-pregnancy intestinal bacterial classifications, 4 calculated indexes of intestinal bacteria, and 6 intestinal bacterial classifications associated with reduced GDM risk. The invention provides guidance for the early diagnosis, prevention and prediction of GDM, for interventional treatment and precise medication, and further supports research into GDM pathogenesis, targeted medication and related topics.
According to a first aspect of the present invention, there is provided a gestational diabetes biomarker based on intestinal bacteria in early pregnancy, comprising at least one of the following 27 classes of bacteria: Actinobacteria spp, TM7 spp, Cyanobacteria spp, Bifidobacterium spp, Coriobacteriaceae spp, Streptococcus spp, TM7-3 spp, Erysipeliostrium spp, Erysipelotrichaceae spp, Phascobacter spp, Streptococcus spp, Lactobacillus spp, Gemellaceae spp, Adlercreutzia spp, Coriobacteriaceae spp, Bifidobacterium spp, Rothbacterium spp, Actinomyces spp, Proteobacteria spp, Veonobacterium spp, Microbacterium spp, and Escherichia sp;
wherein the relative abundance of Proteobacteria spp, Veillonella spp, Cardiobacterium spp, Phascolarctobacterium spp, Lachnospira spp, Streptophyta spp and Paraprevotella spp is significantly increased in the intestinal flora of the gestational diabetes patient group in early pregnancy, significance being defined as a linear discriminant analysis score (LDA score) > 2.0; and the relative abundance of Actinobacteria spp, TM7 spp, Cyanobacteria spp, Bifidobacterium spp, Coriobacteriaceae spp, Streptococcus spp, TM7-3 spp, erysipelothridium spp, Erysipelotrichaceae spp, phascolatobacter spp, Streptococcus spp, Lactobacillus spp, Gemellaceae spp, Adlercreutzia spp, Coriobacteriaceae spp, Bifidobacterium spp, rothothecia spp, Rothia spp, Actinomyces spp, Coprococcus sp and anamnestsis spp is significantly decreased in that group.
Preferably, for the 6 bacterial classifications associated with reduced risk of gestational diabetes, the multivariable-adjusted odds ratios (OR) for gestational diabetes, used as estimates of relative risk, are 0.94, 0.74, 0.01, 0.37, 0.01 and 0.50, respectively, for each 1% increase in relative abundance.
According to another aspect of the present invention, there is provided a gestational diabetes biomarker based on differential ranking analysis of intestinal bacteria in early pregnancy, comprising at least one of the following 4 log-transformed ratios of the relative abundance of intestinal bacteria: log (relative abundance of Staphylococcus/relative abundance of Clostridium), log (relative abundance of Staphylococcus/relative abundance of Roseburia), log (relative abundance of Staphylococcus/relative abundance of Coriobacteriaceae), and log (relative abundance of Staphylococcus/relative abundance of lachnocristium); the above 4 biomarkers are significantly elevated in the intestinal flora of patients with gestational diabetes in early pregnancy, at a significance level of P < 0.001; and the above 4 biomarkers are correlated with the fasting blood glucose level, 1-hour blood glucose level and 2-hour blood glucose level in mid-pregnancy.
According to another aspect of the invention, the application of the gestational diabetes biomarker based on the intestinal flora in the early pregnancy as a detection target in preparing a kit for predicting the onset risk of gestational diabetes in the early pregnancy is provided.
According to another aspect of the invention, the application of the biomarker for gestational diabetes based on the intestinal flora in early pregnancy as a target point in preparing a medicament for treating and/or preventing gestational diabetes is provided;
preferably, the medicament is a probiotic, prebiotic or synbiotic.
According to another aspect of the invention, the application of the gestational diabetes biomarker based on the differential sequencing analysis of gestational diabetes of the intestinal bacteria in the early pregnancy as a detection target in preparing a kit for predicting the onset risk of gestational diabetes in the early pregnancy is provided.
According to another aspect of the invention, the application of the gestational diabetes biomarker based on the differential sequencing analysis of gestational diabetes of intestinal bacteria in early pregnancy as a target point in the preparation of a medicament for treating and/or preventing gestational diabetes is provided;
preferably, the medicament is a probiotic, prebiotic or synbiotic.
According to another aspect of the present invention, there is provided a system for screening the gestational diabetes biomarkers based on intestinal bacteria in early pregnancy, comprising:
a sample genome sequencing module: the sample genome sequencing module is used for performing genome sequencing on DNA samples;
a data quality control module: the data quality control module is used for filtering raw sequence data, merging paired-end sequences and removing chimeric sequences;
an amplicon sequence variant (ASV) construction module: the ASV construction module is used for obtaining ASVs, recording the number of sequences of each ASV in each sample, and obtaining the bacterial taxonomic name of each ASV by alignment against the Greengenes database;
a community structure analysis module: the community structure analysis module is used for determining the intra-group diversity and the inter-group diversity of the microorganisms;
a species composition analysis module: the species composition analysis module is used, on the one hand, for collapsing the ASV data and merging identical names at the same taxonomic level to obtain the relative abundance of the flora composition at each taxonomic level; and, on the other hand, for filtering out sparse ASVs according to ASV prevalence and sequence count to obtain the relative abundance of the core flora composition at the ASV level;
a differential flora analysis module: the differential flora analysis module is used, on the one hand, for normalizing the relative abundance of the flora composition at each taxonomic level and then performing the Kruskal-Wallis rank sum test, the Wilcoxon rank sum test and linear discriminant analysis to obtain the bacterial classifications with significant differences between groups, a significant difference being a linear discriminant analysis score greater than 2.0; and, on the other hand, for applying the analysis of composition of microbiomes (ANCOM) method to the core flora data at the ASV level and screening out the ASVs that differ between groups according to the W statistic.
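The species composition step described above can be illustrated with a minimal sketch. It assumes ASV counts are held in plain dictionaries, and the function names, the taxonomy layout and the filtering thresholds are hypothetical illustrations (the source does not state the prevalence or sequence-count cut-offs); it is not the pipeline actually used in the invention.

```python
from collections import defaultdict

def collapse_to_level(asv_counts, asv_taxonomy, level):
    """Sum ASV read counts per sample for ASVs sharing a name at `level`.

    asv_counts:   {sample_id: {asv_id: read_count}}
    asv_taxonomy: {asv_id: [name_at_level_0, name_at_level_1, ...]}
    level:        index into the taxonomy list (layout is an assumption)
    """
    collapsed = {}
    for sample, counts in asv_counts.items():
        per_taxon = defaultdict(int)
        for asv, n in counts.items():
            per_taxon[asv_taxonomy[asv][level]] += n
        collapsed[sample] = dict(per_taxon)
    return collapsed

def relative_abundance(counts_by_sample):
    """Convert per-sample counts into fractions that sum to 1 per sample."""
    out = {}
    for sample, counts in counts_by_sample.items():
        total = sum(counts.values())
        out[sample] = {t: (n / total if total else 0.0) for t, n in counts.items()}
    return out

def filter_sparse_asvs(asv_counts, min_prevalence, min_total_reads):
    """Keep ASVs seen in at least `min_prevalence` (fraction) of samples
    and with at least `min_total_reads` reads overall (hypothetical cut-offs)."""
    n_samples = len(asv_counts)
    present = defaultdict(int)
    totals = defaultdict(int)
    for counts in asv_counts.values():
        for asv, n in counts.items():
            if n > 0:
                present[asv] += 1
            totals[asv] += n
    keep = {a for a in totals
            if present[a] / n_samples >= min_prevalence and totals[a] >= min_total_reads}
    return {s: {a: n for a, n in c.items() if a in keep} for s, c in asv_counts.items()}
```

Collapsing first and normalizing second keeps the relative abundances at every taxonomic level summing to 1 within a sample; the sparse-ASV filter is applied to raw counts so that the core flora is defined before any normalization.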
Preferably, the system further comprises:
a logistic regression model evaluation module: the logistic regression model evaluation module is used for evaluating, through a conditional logistic regression model, the association between each individual intestinal bacterium and the risk of onset of gestational diabetes, and for screening the intestinal bacterial classifications associated with the risk of new-onset gestational diabetes;
a gestational diabetes risk factor prediction module: the gestational diabetes risk factor prediction module is used for evaluating the change in the predictive performance of the basic gestational diabetes prediction model after the intestinal bacterial factors associated with the risk of new-onset gestational diabetes are added to it.
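As an illustration of the conditional logistic regression mentioned above, the following sketch fits a single-covariate model to 1:1 matched case-control pairs by Newton's method. The 1:1 matching, the single covariate (the case-minus-control difference in relative abundance, in percentage points) and the function name are assumptions made for the example; a real analysis would use an established statistics package with multivariable adjustment.

```python
import math

def fit_conditional_logit_1to1(diffs, n_iter=50, tol=1e-10):
    """Fit beta for 1:1 matched pairs, where diffs[i] = x_case - x_control.

    The conditional log-likelihood is sum_i -log(1 + exp(-beta * d_i)).
    Returns (beta, standard_error). Assumes the differences do not all share
    the same sign; otherwise the estimate diverges (separation).
    """
    beta = 0.0
    for _ in range(n_iter):
        grad = 0.0   # first derivative of the log-likelihood
        info = 0.0   # negative second derivative (observed information)
        for d in diffs:
            p = 1.0 / (1.0 + math.exp(-beta * d))
            grad += d * (1.0 - p)
            info += d * d * p * (1.0 - p)
        if info == 0.0:
            break
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    # Recompute the information at the final estimate for the standard error.
    info = sum(d * d * (1.0 / (1.0 + math.exp(-beta * d)))
               * (1.0 - 1.0 / (1.0 + math.exp(-beta * d))) for d in diffs)
    se = math.sqrt(1.0 / info) if info > 0 else float("inf")
    return beta, se

# Hypothetical paired differences in relative abundance (percentage points).
beta, se = fit_conditional_logit_1to1([-1.0, -1.0, -1.0, 1.0])
odds_ratio = math.exp(beta)                      # OR per 1-point increase
ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
```

Only the within-pair difference enters the likelihood, which is how matching on age and sampling time is respected: anything shared by a case and its matched control cancels out.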
According to another aspect of the present invention, there is provided a system for screening the gestational diabetes biomarkers based on differential ranking analysis of intestinal bacteria in early pregnancy, comprising:
a sample genome sequencing module: the sample genome sequencing module is used for performing genome sequencing on DNA samples;
a data quality control module: the data quality control module is used for filtering raw sequence data, merging paired-end sequences and removing chimeric sequences;
an amplicon sequence variant (ASV) construction module: the ASV construction module is used for obtaining ASVs, recording the number of sequences of each ASV in each sample, and obtaining the bacterial taxonomic name of each ASV by alignment against the Greengenes database;
a species composition analysis module: the species composition analysis module is used for collapsing the ASV data to the bacterial species level and merging identical taxonomic names at the species level to obtain the relative abundance of the flora composition at the bacterial species level;
a differential ranking analysis module: the differential ranking analysis module is used for ranking the bacterial classifications by their differences between groups after eliminating the influence of the absolute abundance of the bacteria, and for calculating the log-transformed ratios of the relative abundance of 4 groups of intestinal bacteria, namely log (relative abundance of Staphylococcus/relative abundance of Clostridium), log (relative abundance of Staphylococcus/relative abundance of Roseburia), log (relative abundance of Staphylococcus/relative abundance of Coriobacteriaceae) and log (relative abundance of Staphylococcus/relative abundance of lactobacillus), respectively.
Generally, compared with the prior art, the above technical solution conceived by the present invention mainly has the following technical advantages:
(1) Based on a birth cohort, 27 intestinal bacterial biomarkers, 4 calculated indexes of intestinal bacteria and 6 intestinal bacterial biomarkers associated with reduced GDM risk are screened out by analyzing the genome sequences of the intestinal bacterial communities of GDM patients and non-GDM pregnant women in early pregnancy. The sample size of the screening is sufficient (201 cases: 201 controls), and the statistical power exceeds 99%. Regression model testing shows that the screened intestinal bacteria significantly improve the predictive performance of a GDM prediction model built on traditional risk factors. Several bacterial indexes involving Staphylococcus are significantly correlated with the indicators of the mid-pregnancy glucose screening test. The biomarkers can be used as detection targets in preparing a detection kit for predicting the risk of onset of gestational diabetes in early pregnancy, or as targets in preparing a medicament for treating and/or preventing gestational diabetes.
(2) Feces, as a bodily excretion, is a stable and accurate sample for studying the intestinal flora. By studying stool samples, the flora that differs between GDM patients and healthy pregnant women is identified in early pregnancy, before GDM has developed, and the risk of GDM in pregnant women is predicted, which facilitates early prevention and diagnosis.
(3) The gestational diabetes biomarkers provided by the invention are of high value for the early prevention and early diagnosis of the disease. First, the availability, operability, safety and affordability of stool samples ensure patient compliance. Second, the representativeness and relative stability of stool samples with respect to the intestinal flora ensure the reliability of the results. Third, the flora information of the stool samples is obtained by high-throughput sequencing, so the resulting biomarkers have high sensitivity and specificity. Fourth, the biomarkers of the invention can also be used to monitor the effect of, and the response of GDM patients to, interventional therapy throughout pregnancy.
Drawings
FIG. 1 shows the analysis workflow for screening GDM-associated intestinal bacteria and bacterial indexes by bioinformatic and statistical methods.
FIG. 2 shows the principal coordinate analysis (PCoA) of the overall difference in intestinal flora structure between the GDM case group and the control group in early pregnancy. FIG. 2 (a) and FIG. 2 (b) show the between-group differences in intestinal flora beta diversity calculated from weighted and unweighted UniFrac distances, respectively. Permutational multivariate analysis of variance (PERMANOVA) tests the significance of the difference between groups, with P < 0.05 indicating a significant difference. Permutational analysis of multivariate dispersions (PERMDISP) tests the dispersion of the two groups of samples, and the dispersion of the two groups is considered consistent when P > 0.05.
FIG. 3 shows the bacterial classifications whose relative abundance differs significantly between the GDM case group and the control group, wherein FIG. 3 (a) shows those at the phylum level and FIG. 3 (b) shows those at the family level.
FIG. 4 shows the bacterial classifications with significant differences in relative abundance in the intestinal flora of pregnant women in the GDM case group and the control group in early pregnancy, screened by linear discriminant analysis effect size (LEfSe). Blank bars represent bacteria significantly enriched in the control group, and diagonally striped bars represent bacteria significantly enriched in the GDM case group.
FIG. 5 (A) shows the bacteria with significant differences between the GDM case group and the control group in early pregnancy, screened by a random forest model; FIG. 5 (B) shows the ROC values calculated from the model data.
FIG. 6 (a) shows the 5 bacterial classifications in early pregnancy selected by differential ranking (DR) analysis, which excludes the influence of total bacterial load; FIG. 6 (b) shows the fold difference in the log-transformed relative abundance ratios of intestinal bacteria between the case group (diagonally striped bars) and the control group (blank bars), and the correlations of these log-transformed values with the mid-pregnancy fasting blood glucose (FBG) level, 1-hour blood glucose level (1h-PG) and 2-hour blood glucose level (2h-PG). r represents the correlation coefficient. * represents P < 0.05, ** represents P < 0.01, *** represents P < 0.001, and **** represents P < 0.0001.
FIG. 7 shows the network analysis of correlations within the intestinal flora of the control group and of the case group in early pregnancy. Correlations between bacteria in the network are calculated with SparCC, and nodes of the same shape belong to the same phylum. Only bacterial classifications with r > 0.2 and P < 0.05 are shown. Solid lines represent positive correlations and dashed lines represent negative correlations.
FIG. 8 shows the GDM regression prediction models. The regression model based on traditional GDM risk factors reaches a predictive performance of 0.69 (0.64, 0.74), and the prediction model with the 6 intestinal bacterial classifications added reaches 0.75 (0.71, 0.80). A significance test shows that the predictive performance of the latter is significantly improved over that of the former.
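For readers unfamiliar with how such a predictive performance figure is obtained, the sketch below computes the area under the ROC curve (AUC) from predicted risk scores. It assumes that the values reported for FIG. 8 are AUCs, which the text does not state explicitly; the function name and the example scores are hypothetical.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a randomly chosen case has a higher predicted
    score than a randomly chosen control, counting ties as one half."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical predicted risks from a base model and an extended model.
base_auc = auc([0.6, 0.3], [0.4, 0.2])
extended_auc = auc([0.7, 0.5], [0.4, 0.2])
```

Comparing the AUC of the base model with that of the model extended by the bacterial classifications is one common way to express a gain in predictive performance; a formal comparison additionally needs a significance test on paired predictions.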
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The early pregnancy GDM biomarker based on the intestinal microorganisms comprises the following types of 27 bacteria: enterobacteria actinomycetea (Actinobacteria), TM7 phylum, Cyanobacteria (cyanobacterium), Bifidobacterium (Bifidobacterium), eupolyphagaceae (corynebacterium), Streptococcaceae (Streptococcus), genus 261 under TM7, erysipelothridium (162), genus 159 under erysiphe (erysipelotrichiaceae), genus zugium (phascolybacter, 90), genus Streptococcus (Streptococcus, 89), genus Lactobacillus (Lactobacillus, 84), genus geminidae (Gemellaceae, 74), genus alurencirus (adlerreutyzia (adlerley euvia, 27), genus eupolyphagaceae (corynebacterium), genus Bifidobacterium (Bifidobacterium, 23), genus Bifidobacterium (clostridium), genus Lactobacillus (corynebacterium) and species (Lactobacillus), reduced in one unknown species among Lactobacillus species (Lactobacillus ), Lactobacillus group (gdnococcus, 23), Lactobacillus (Lactobacillus) and Lactobacillus); whereas Proteobacteria (Proteobacteria), Veillonellaceae (Veillonellaceae), Corynebacteria (Cardiobacter, No. 226), Coorabacter (Phascolarcotobacterium, No. 143), Lachnospira (Lachnospira, No. 113), 67 under Trichoderma (Streptophyta), and Parapella (Paraprewotella, No. 54) were significantly elevated in the GDM group.
Four calculated indexes based on the relative abundance of intestinal bacteria are screened out by the differential ranking (DR) method: 1) log (Staphylococcus relative abundance/Clostridium relative abundance); 2) log (Staphylococcus relative abundance/Roseburia relative abundance); 3) log (Staphylococcus relative abundance/Coriobacteriaceae relative abundance); 4) log (Staphylococcus relative abundance/ralstonia relative abundance). In these formulas, Staphylococcus is the top-ranked genus in the GDM case group after differential ranking analysis, and the other 4 types of bacteria are the top-ranked bacteria in the non-GDM group. For the specific calculation method, refer to the relevant content of the specification.
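The calculated indexes above can be sketched as follows. The natural logarithm and the small pseudocount that guards against zero abundances are assumptions, since the source specifies neither the log base nor the handling of zeros; the function name and the example abundances are hypothetical.

```python
import math

def log_ratio(rel_abund, numerator, denominator, pseudocount=1e-6):
    """log(relative abundance of numerator / relative abundance of denominator).

    rel_abund:   {taxon_name: relative abundance as a fraction} for one sample
    pseudocount: added to both terms so a zero abundance does not produce
                 log(0) or a division by zero (an assumption, not from the source)
    """
    num = rel_abund.get(numerator, 0.0) + pseudocount
    den = rel_abund.get(denominator, 0.0) + pseudocount
    return math.log(num / den)

# Hypothetical relative abundances for one stool sample.
sample = {"Staphylococcus": 0.02, "Roseburia": 0.01, "Clostridium": 0.04}
index_roseburia = log_ratio(sample, "Staphylococcus", "Roseburia")
index_clostridium = log_ratio(sample, "Staphylococcus", "Clostridium")
```

A ratio of two relative abundances within the same sample does not depend on the total bacterial load, because the shared denominator cancels, which is the property the differential ranking approach relies on.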
Six bacterial taxa significantly associated with a reduced risk of GDM were screened out by logistic regression models, specifically: Actinobacteria, Coriobacteriaceae, Gemellaceae, genus No. 26 under Coriobacteriaceae, genus No. 74 under Gemellaceae, and 1 unknown bacterial species under Coprococcus. For each 1% increase in the relative abundance of these 6 intestinal bacterial taxa, the multivariable-adjusted odds ratios (OR) were: Actinobacteria (OR, 0.94; 95% CI, 0.90, 0.98), Coriobacteriaceae (OR, 0.74; 95% CI, 0.63, 0.88), Gemellaceae (OR, 0.01; 95% CI, 0.001, 0.29), genus No. 26 under Coriobacteriaceae (OR, 0.37; 95% CI, 0.21, 0.64), genus No. 74 under Gemellaceae (OR, 0.01; 95% CI, 0.001, 0.28), and 1 unknown bacterial species under Coprococcus (OR, 0.50; 95% CI, 0.30, 0.83).
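To make the "per 1% increase" interpretation concrete, the sketch below scales an odds ratio to a larger change in relative abundance. It assumes the usual logistic-regression property that the log-odds are linear in the predictor, so the OR compounds multiplicatively; the function name is hypothetical.

```python
def odds_ratio_for_increase(or_per_percent, percent_points):
    """OR implied by a change of `percent_points` percentage points,
    assuming log-odds are linear in relative abundance."""
    return or_per_percent ** percent_points

# OR of 0.94 per 1% (the Actinobacteria value above) applied to a 5% increase.
or_5 = odds_ratio_for_increase(0.94, 5)
```

Under this assumption, a 5% higher relative abundance corresponds to an OR of about 0.73.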
The invention is based on a population-based birth cohort study and adopts a nested case-control design: fecal samples are collected from pregnant women diagnosed with GDM in mid-pregnancy, and fecal samples from a non-GDM control group are matched on age, sampling time, and other factors. DNA is isolated and purified from the fecal samples, libraries are constructed for 16S rDNA amplicon sequencing, and the structure and functional information of the intestinal flora and its association with GDM are analyzed.
The technical scheme for realizing the invention is as follows:
(1) Isolating and purifying fecal DNA from pregnant women in early pregnancy and performing high-throughput sequencing of 16S rDNA amplicons. Fecal sample collection, DNA extraction, and quality control: all consumables used for fecal collection are sterile or are sterilized by high temperature, high pressure, and ultraviolet irradiation. After collection, samples are stored in a -40 ℃ freezer for 24 h and are transferred via cold chain to a -80 ℃ freezer within 3 months for storage. Microbial community DNA is isolated from the fecal samples with a fecal DNA extraction kit, the DNA concentration is measured with a NanoDrop 2000 ultra-micro spectrophotometer, and DNA integrity is checked by agarose gel electrophoresis.
(2) 16S rRNA gene library construction: the V3-V4 region of the 16S rRNA gene is amplified by two-step PCR to construct the library. The concentration of the pooled library is determined with a Qubit fluorometer. After the library passes quality checks by qPCR and LabChip/Agilent 2000, paired-end sequencing (PE300, 2 × 300 bp) is performed on the Illumina MiSeq. For quality control, a negative control (nuclease-free water) and a positive control are included throughout the experiment.
(3) Intestinal bacteria and bacterial indexes associated with GDM are screened by bioinformatic and statistical methods; the analysis workflow is shown in Figure 1, in which:
ASV: amplicon sequence variant;
LEfSe: linear discriminant analysis effect size;
ANCOM: analysis of composition of microbiomes;
DR: differential ranking.
Example 1: separating and purifying fecal sample DNA
(1) Preparation of bacterial lysate
Before extracting the fecal DNA, a bacterial lysate is prepared for breaking cells in the fecal tissue so as to release the internal bacterial DNA. Bacterial lysate was placed in a 50mL centrifuge tube as follows.
Bacterial lysate system (32 mL):
Figure BDA0003046985500000111
Vortex to mix well, then aliquot into 8 ultrafiltration tubes, 4 mL each. Centrifuge at 3500 rpm at 4 ℃ for 30 min and discard the filter membrane to obtain the bacterial lysate. Transfer the lysate to four 15 mL centrifuge tubes and store at -20 ℃.
(2) Fecal DNA extraction
Before DNA extraction, fecal samples were centrifuged at 12000 rpm for 10 min to remove ethanol, and fecal DNA was then extracted with the Tiangen fecal DNA extraction kit (DP328). A 0.18-0.22 g fecal sample was weighed into a 2 mL Safe-lock centrifuge tube, 200 μL of bacterial lysate was added, and the mixture was vortexed and briefly centrifuged. The tube was incubated in a 37 ℃ water bath for 1 h; every 30 min it was taken out, mixed by inversion several times, and returned to the bath. After the water bath, 500 μL of SA solution was added, followed by 0.25-0.30 g of pre-weighed glass beads, and the sample was ground in a rapid grinder for 1 min (program: 30 s oscillation, 30 s pause, 30 s oscillation, 30 s pause).
After grinding, the tube was briefly centrifuged to prevent liquid from accumulating in the lid. 100 μL of SC solution and 15 μL of proteinase K were added, and the mixture was vortexed and briefly centrifuged. The tube was incubated in a 70 ℃ water bath for 15 min; every 5 min it was taken out, the cap was opened to release gas, and the tube was mixed by inversion several times and returned to the bath. The tube was then incubated in a 95 ℃ water bath for 15 min, with the same venting and mixing every 5 min.
After the water bath, the tube was left to cool to room temperature and centrifuged at 12000 rpm for 3 min, and 500 μL of supernatant was transferred to a 2 mL centrifuge tube. 10 μL of RNase A was added, and the tube was mixed by inversion, briefly centrifuged, and left at room temperature for 5 min. 200 μL of SH was added, and the tube was mixed by inversion, briefly centrifuged, and incubated on ice for 5 min. After centrifugation at 12000 rpm for 3 min, 450 μL of supernatant was transferred to a 1.5 mL centrifuge tube. 450 μL of GFA was added, and the tube was mixed by inversion and briefly centrifuged.
After centrifugation, 900 μL of the suspension was transferred to adsorption column CR2 and centrifuged at 12000 rpm for 1 min, and the waste liquid was discarded. 500 μL of GD was added to the column, which was centrifuged at 12000 rpm for 1 min, and the waste liquid was discarded. 700 μL of PW was added to the column, which was centrifuged at 12000 rpm for 1 min, and the waste liquid was discarded; this PW wash was then repeated once. The empty column was centrifuged at 12000 rpm for 3 min, placed in a new 1.5 mL Safe-lock centrifuge tube, and left open at room temperature for 5 min. After centrifugation at 12000 rpm for 3 min, the solution obtained was pipetted back onto the center of the adsorption membrane and left at room temperature for 5 min. A final centrifugation at 12000 rpm for 3 min yielded the fecal DNA, which was stored in a -25 ℃ freezer.
(3) DNA quality control
The DNA concentration was measured with a NanoDrop 2000 ultra-micro spectrophotometer. The measurement pedestal was cleaned with sterile water 3 times, the sampling arm was raised, and 1 μL of sample, mixed by flicking, was pipetted onto the pedestal surface. The sampling arm was lowered so that the sample formed a liquid column between the upper and lower pedestals, and the DNA concentration, A260/A280, and A260/A230 were calculated. After the measurement, the sampling arm was raised, the sample was wiped off the upper and lower pedestals with lint-free paper, and the pedestal was washed with sterile water 3 times. A260 reflects absorbance by nucleic acids, A280 by proteins, and A230 by carbohydrates. For DNA of good purity, A260/A280 should be about 1.8 and A260/A230 should be > 2.0. If the purity was unsatisfactory, the DNA was purified with the Tiangen DNA purification and recovery kit (DP214) and quality-checked again.
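The purity criteria above can be expressed as a simple check. This is only a sketch: the source says A260/A280 should be "around 1.8" without giving a tolerance, so the ±0.1 window, the function name, and the absorbance readings used below are assumptions.

```python
def dna_purity_ok(a260, a280, a230, target_280=1.8, tolerance=0.1, min_230=2.0):
    """Return True when both purity ratios meet the stated criteria.

    tolerance is an assumed window around the A260/A280 target of 1.8.
    """
    ratio_280 = a260 / a280
    ratio_230 = a260 / a230
    return abs(ratio_280 - target_280) <= tolerance and ratio_230 > min_230

# Hypothetical absorbance readings.
good = dna_purity_ok(a260=1.00, a280=0.55, a230=0.45)   # ratios ~1.82 and ~2.22
poor = dna_purity_ok(a260=1.00, a280=0.70, a230=0.45)   # A260/A280 ~1.43
```

A sample failing the check would be sent for re-purification and measured again, as described above.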
DNA integrity was checked by agarose gel electrophoresis. 1.5% agarose gel was prepared with 1 XTAE buffer, and 2. mu.L of DNA sample was subjected to agarose gel electrophoresis at 105V for 30 min. The result is checked by imaging of a gel imager, and the complete DNA is a clear single band with the size of about 500 bp.
Example 2: 16S rRNA gene amplicon library construction
The V3-V4 region of the 16S rRNA gene was amplified using two-step PCR to construct a library.
The first PCR step:
Primer sequences:
Forward (341F): 5’-ACACTCTTTCCCTACACGACGCTCTTCCGATCTCCTACGGGNGGCWGCAG-3’;
Reverse (805R): 5’-AGACGTGTGCTCTTCCGATCTGACTACHVGGGTATCTAATCC-3’.
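Both first-step primers contain IUPAC degenerate bases (N and W in the forward primer, H and V in the reverse primer), so each stands for a small pool of concrete oligonucleotides. The sketch below enumerates that pool; the helper name `expand` is hypothetical, and the code is illustrative rather than part of the described method.

```python
from itertools import product

# Standard IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]

FORWARD_341F = "ACACTCTTTCCCTACACGACGCTCTTCCGATCTCCTACGGGNGGCWGCAG"
REVERSE_805R = "AGACGTGTGCTCTTCCGATCTGACTACHVGGGTATCTAATCC"

forward_pool = expand(FORWARD_341F)   # N (4 options) x W (2 options)
reverse_pool = expand(REVERSE_805R)   # H (3 options) x V (3 options)
```

The forward primer expands to 8 sequences and the reverse primer to 9.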
the first PCR amplification reaction was performed as follows.
First step PCR amplification system (25 μ L):
Figure BDA0003046985500000131
the first step PCR reaction conditions:
Figure BDA0003046985500000132
The first-step PCR amplification product was purified by the magnetic bead method to remove impurities such as primer dimers.
The second PCR step:
Primer sequences:
Forward: 5’-AATGATACGGCGACCACCGAGATCTACAC (Index sequence 1) ACACTCTTTCCCTACACGACG-3’;
Reverse: 5’-CAAGCAGAAGACGGCATACGAGAT (Index sequence 2) GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3’.
Index sequences 1 and 2 are specific barcode sequences set for each sample to identify the sample.
The second PCR amplification reaction was performed as follows.
Second PCR amplification System (25. mu.L):
Figure BDA0003046985500000141
the second step PCR reaction conditions:
Figure BDA0003046985500000142
The concentration of the pooled library was determined with a Qubit fluorometer. After the library passed quality checks by qPCR and LabChip/Agilent 2000, paired-end sequencing (PE300, 2 × 300 bp) was performed on the Illumina MiSeq. For quality control, a negative control (nuclease-free water) and a positive control (a mixture containing equal amounts of DNA from Rhodospirillum rubrum, Bacteroides uniformis, Streptococcus hemolyticus, Klebsiella pneumoniae, Escherichia coli, Corynebacterium glutamicum, Burkholderia cepacia, and Staphylococcus aureus) were included throughout the experiment.
Example 3
Intestinal bacteria and bacterial indexes related to GDM are screened by a bioinformatics method and a statistical method, and the analysis process is briefly described as follows:
(1) sequence data processing
The constructed library was sequenced on the Illumina MiSeq platform to obtain raw sequence data, which were processed mainly with QIIME2 (Quantitative Insights Into Microbial Ecology version 2; https://qiime2.org/); significance analysis and visualization were performed with R packages.
The raw sequencing output contains sequencing adapters and some low-quality reads, so quality control of the raw data is performed with QIIME2 plugins. Filtering of low-quality sequences, merging of paired-end reads, and chimera removal are all performed by the DADA2 plugin of QIIME2. DADA2 is a model-based error-correction algorithm. Unlike the conventional UPARSE strategy of clustering highly similar sequences into operational taxonomic units (OTUs), DADA2 denoises the reads to recover the underlying sequence features, which are called amplicon sequence variants (ASVs). Compared with OTUs, ASVs have higher resolution and can be matched to finer taxonomic levels. After denoising, DADA2 generates an ASV table that records the MD5 identifier of each ASV feature and the number of sequences of each ASV in each sample; this table is the common input for the subsequent microbial community analyses. Matching the features in the ASV table to known taxonomy on the basis of sequence specificity helps identify taxa that differ significantly between groups. To this end, a naive Bayes classifier was trained with q2-feature-classifier on the V3-V4 variable region, using the sequence and taxonomy information of the Greengenes 13_8 database clustered at 99% identity. Classifying the obtained feature sequences with this classifier yields taxonomic names from kingdom to species and the corresponding counts.
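The MD5 identifier mentioned above can be illustrated with Python's standard library. This sketch assumes the identifier is the hexadecimal MD5 digest of the ASV's nucleotide sequence; the upper-casing step, the function name, and the example sequence are assumptions for illustration.

```python
import hashlib

def asv_feature_id(sequence):
    """Hexadecimal MD5 digest of a nucleotide sequence (assumed ID scheme)."""
    return hashlib.md5(sequence.upper().encode("ascii")).hexdigest()

# Hypothetical short sequence, for illustration only.
feature_id = asv_feature_id("CCTACGGGAGGCAGCAG")
```

Because the identifier is derived from the sequence itself, identical ASVs recovered in different runs receive the same identifier.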
(2) Colony structure analysis
Comparative analysis requires samples at the same sequencing depth, so the generated ASV table is rarefied, and rarefaction curves are drawn to select an appropriate sampling depth. Rarefaction curves assess whether the sequencing depth covers the whole community: a fixed number of sequences is randomly drawn from each sample to reach a uniform depth, and the number of ASVs observed in each sample at that depth is estimated.
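The random subsampling step can be sketched as follows. This is a plain-Python illustration of rarefying one sample without replacement, not the QIIME2 implementation; the function names, the fixed seed, and the counts are assumptions.

```python
import random

def rarefy(counts, depth, seed=0):
    """Randomly draw `depth` reads without replacement from one sample.

    counts maps ASV id -> read count. Returns None when the sample has
    fewer than `depth` reads and would be dropped at this depth.
    """
    if sum(counts.values()) < depth:
        return None
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    drawn = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for asv in drawn:
        rarefied[asv] = rarefied.get(asv, 0) + 1
    return rarefied

def observed_asvs(counts):
    """Number of ASVs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)

# Hypothetical sample with 100 reads.
sample_counts = {"asv1": 60, "asv2": 30, "asv3": 10}
rarefied = rarefy(sample_counts, depth=50)
```

Repeating the draw at increasing depths and plotting `observed_asvs` against depth gives one rarefaction curve per sample.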
Intra-group diversity (α diversity) and inter-group diversity (β diversity) of the microorganisms were determined with QIIME2. α diversity refers to the species diversity within a local habitat, and the q2-diversity plugin was used to calculate several indices, including the number of observed ASVs, species evenness, the Shannon diversity index, and the Simpson index. The number of observed ASVs considers only the number of species within a group; the higher the number, the greater the species richness of the sample. The more evenly individuals are distributed among species, the greater the evenness. The Shannon diversity index takes both the richness and the evenness of the community into account; the higher the value, the higher the community diversity. The Simpson index is the probability that two randomly sampled individuals belong to the same species; it reflects the status and role of dominant species in the community, and the larger the value, the lower the community diversity. The calculation showed no significant difference in the α diversity indices of the intestinal flora between the GDM case group and the control group (P > 0.05). As can be seen from the principal coordinate analysis in Figure 2, the overall structure of the early-pregnancy intestinal flora differed significantly between the GDM case group and the control group (P < 0.05), while the dispersion among samples did not differ between the case group and the control group (P > 0.05).
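The α-diversity indices named above can be computed directly from a vector of ASV counts. The sketch below is illustrative only: the natural logarithm, the dominance form of the Simpson index (sum of squared proportions, for which larger values mean lower diversity), and Pielou's form of evenness are assumptions, and QIIME2 may use different conventions.

```python
import math

def shannon(counts):
    """Shannon diversity index (natural log)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson_dominance(counts):
    """Sum of squared proportions; larger values mean lower diversity."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

def evenness(counts):
    """Shannon index divided by its maximum, ln(richness)."""
    richness = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(richness) if richness > 1 else 0.0

# Hypothetical communities: perfectly even versus dominated by one ASV.
even_community = [10, 10, 10, 10]
skewed_community = [97, 1, 1, 1]
```

The even community has the higher Shannon index and evenness, while the skewed community has the higher dominance value.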
(3) Analysis of species composition
The relative abundance of each bacterium is calculated as the count of that bacterium in a sample divided by the total count of all bacteria in that sample; the prevalence of each bacterium is calculated as the number of samples containing that bacterium, at the taxonomic level to which it belongs, divided by the total number of samples.
The original ASV data were collapsed by merging features with the same name at each taxonomic level to obtain relative abundance data of flora composition at each level, and prevalence was calculated from the number of occurrences. Flora composition at the phylum, family and genus levels is described according to relative abundance. By testing the significance of differences in the relative abundance of dominant intestinal bacteria between the case group and the control group, 4 intestinal bacteria at the phylum level and 4 at the family level were found to differ significantly between groups (Figure 3). At the phylum level, Actinobacteria, TM7 and Cyanobacteria were significantly reduced in the GDM group, whereas Proteobacteria were significantly enriched in the GDM group, see (a) in Figure 3. At the family level, Bifidobacteriaceae, Coriobacteriaceae and Streptococcaceae were significantly reduced in the GDM group, whereas Veillonellaceae were significantly enriched, see (b) in Figure 3.
The original ASV data were also filtered to remove sparse ASVs with a prevalence below 25% and/or a sequence count below 0.0001% of the total sequence count, with prevalence calculated from the number of occurrences of each ASV. This finally yielded the relative abundance data of flora composition at the ASV level.
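The relative abundance, prevalence, and sparse-ASV filter described above can be sketched as below. The table layout (sample name mapped to ASV counts), the function names, and the data are hypothetical; the thresholds are the 25% prevalence and 0.0001% (1e-6) sequence fraction stated in the text, and an ASV is kept only when it passes both.

```python
def relative_abundance(sample_counts):
    """Count of each ASV divided by the sample's total count."""
    total = sum(sample_counts.values())
    return {asv: n / total for asv, n in sample_counts.items()}

def prevalence(table, asv):
    """Fraction of samples in which the ASV occurs."""
    return sum(1 for s in table.values() if s.get(asv, 0) > 0) / len(table)

def filter_sparse(table, min_prevalence=0.25, min_fraction=1e-6):
    """Drop ASVs below either threshold; keep the rest."""
    grand_total = sum(sum(s.values()) for s in table.values())
    all_asvs = {asv for s in table.values() for asv in s}
    keep = {
        asv for asv in all_asvs
        if prevalence(table, asv) >= min_prevalence
        and sum(s.get(asv, 0) for s in table.values()) / grand_total >= min_fraction
    }
    return {name: {a: n for a, n in s.items() if a in keep}
            for name, s in table.items()}

# Hypothetical table: ASV "b" occurs in only 1 of 5 samples (20% prevalence).
table = {"s1": {"a": 90, "b": 10}, "s2": {"a": 100}, "s3": {"a": 100},
         "s4": {"a": 100}, "s5": {"a": 100}}
filtered = filter_sparse(table)
```

In this example "b" is removed by the prevalence threshold and "a" is retained.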
(4) Differential flora analysis
The linear Discriminant Analysis Effect Size (LEfSe) is a commonly used method for searching for biomarkers in sequencing of 16SrRNA gene amplicons, was developed since 2011, and is widely applied due to convenience and easiness in operation. LEfSe is used for searching for inter-group differential flora from the angle of relative abundance, is commonly used for searching for inter-group identification in high-dimensional organisms, and is mainly used for searching for differential factors of two or more groups of organisms. By utilizing the LEfSe function of an online Analysis website (http:// huttenhouse. sph. harvard. edu/galaxy /), the flora with obvious difference among groups is obtained by firstly carrying out standardization processing on data and then carrying out screening in three steps of Kruskal-Wallis rank sum test, Wilcoxon rank sum test and Linear Discriminant Analysis (LDA), and the difference is represented by | LDA | > 2.0. The cladogram and LDA score map of the significant difference bacteria are generated by an online analysis website. The genus numbers are shown in Table 1. Bacterial classes with significantly different relative abundances in the intestinal flora of the pregnant women of the 17 early-pregnancy GDM case groups and the control group were screened out according to the LEfSe (figure 4). Among them, 12 types of bacteria belonging to genus 261, genus erypelieliostridium (162), genus 159 in veillonellaceae, genus zurich (90), genus streptococcus (89), genus lactobacillus (84), genus geminidae (74), genus adelercloezitosum (27), genus 26 in toona, genus bifidobacterium (23), genus rossi (15), and genus actinomyces (6) in TM7 were significantly reduced in GDM case group. By checking the relative abundance of bacteria for rank sum, 8 types of bacteria were screened for significant differences between groups (P <0.05) at both the phylum level and the family level (fig. 2). 
Wherein 6 types of bacteria belonging to Actinomycetes, TM7, cyanobacteria, Bifidobacterium, Toonaceae, and Streptococcus are reduced in GDM group of cases. According to the relative abundance data of the flora composition under the ASV level obtained after the prevalence rate and the sequence number are filtered, 2 ASVs which can be used for distinguishing the difference between case control groups are screened out by applying an ANCOM method according to the W statistical value, are respectively 1 unknown strain under the coprococcus genus and 1 unknown strain under the anaerobic corynebacterium genus, and are reduced in the GDM case group. LEfSe screened 12 bacterial classifications that decreased in the case group, figure 2 screened 6 bacterial classifications that decreased in the case group by the results of the relative abundance rank-sum test of bacteria, plus 2 bacterial classifications that decreased in the case group by the analysis of microbial composition, totaled 20 bacterial classifications that decreased in the case group.
On the other hand, Cardiobacterium (No. 226), Phascolarctobacterium (No. 143), Lachnospira (No. 113), genus No. 67 under Streptophyta, and Paraprevotella (No. 54), selected by the LEfSe method, were significantly enriched in the GDM case group (LDA score > 2.0). The rank-sum test of bacterial relative abundance (fig. 2) showed that Proteobacteria and Veillonellaceae were significantly enriched in the GDM case group (P < 0.05). The 5 bacterial taxa found by the LEfSe method to be increased in the case group, together with the 2 taxa found by the rank-sum test of bacterial relative abundance to be increased in the GDM group (fig. 2), give a total of 7 bacterial taxa significantly enriched in the GDM case group.
Table 1. Numbers assigned to the 274 bacterial genera
Figure BDA0003046985500000181
Figure BDA0003046985500000191
Figure BDA0003046985500000201
Figure BDA0003046985500000211
Figure BDA0003046985500000221
Figure BDA0003046985500000231
Figure BDA0003046985500000241
To explore potential GDM-related biomarkers and verify the accuracy of the model, a random forest model based on taxonomic groups was built (with q2-sample-classifier). The samples were randomly split into a training set and a test set at a ratio of 4:1, and hyperparameters were tuned automatically by randomized grid search to guard against overfitting. Feature selection was optimized by recursive feature elimination. The model was validated by ten-fold cross-validation. A receiver operating characteristic (ROC) curve was constructed to represent the accuracy of the model. The area enclosed by the ROC curve and the coordinate axes is the area under the curve (AUC), which ranges from 0 to 1.0; an AUC > 0.7 is generally taken to indicate a useful model, values closer to 1 indicate better discrimination, and values near 0.5 indicate poor discrimination. The random forest model based on the intestinal flora distinguished GDM from non-GDM intestinal flora with relatively high accuracy (AUC = 0.77) (Figure 5).
Because compositional data are subject to a constant-sum constraint and do not follow a multivariate normal distribution, traditional statistical methods fail. To remove the influence of the total intestinal bacterial load on the results and to identify differential flora from compositional data (relative abundances), the differential ranking (DR) method was adopted: the compositional data were subjected to the additive log-ratio (alr) transformation, which eliminates the redundant dimension of the original data and normalizes them; multinomial regression models were fitted for the groups of interest; and changes in bacteria were judged from the model coefficients. To further eliminate bias in the compositional data and to better explain the results, the rankings of the model coefficients of different bacteria were compared to determine the differential flora. DR analysis was performed with the songbird plugin of QIIME2, and the model was evaluated by its log-likelihood and cross-validation score.
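The AUC described above equals the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The sketch below computes it from that definition by comparing all case-control pairs; it is a generic illustration, not the q2-sample-classifier implementation, and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 for case, 0 for control. Ties count as half a correct pair.
    """
    cases = [s for label, s in zip(labels, scores) if label == 1]
    controls = [s for label, s in zip(labels, scores) if label == 0]
    correct = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases for k in controls
    )
    return correct / (len(cases) * len(controls))

# Hypothetical predicted probabilities for 2 cases and 2 controls.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.3, 0.4, 0.1])
```

Here 3 of the 4 case-control pairs are ordered correctly, so the AUC is 0.75.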
Figure BDA0003046985500000251
$\eta_i = \mathrm{alr}^{-1}(X_i \beta)$
$Y_i \sim \mathrm{Multinomial}(\eta_i)$
The model construction comprises the following steps:
(1) Perform the alr transformation on the sequencing reads of each sample to obtain an alr vector;
(2) Add a prior centered at zero to the alr vector of each sample (row);
(3) For each individual, apply the inverse alr transformation after combining the alr vector with the covariate vector;
(4) Fit the multinomial model to obtain the coefficient (β) of each component (bacterium);
(5) Transform the β values by clr before they are used to compare ranks.
Figure BDA0003046985500000252
Here, the additive log-ratio (alr) transformation is:
Figure BDA0003046985500000253
and the centered log-ratio (clr) transformation is:
Figure BDA0003046985500000254
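For readers without access to the formula images, the sketch below implements the standard textbook definitions of the alr transformation, its inverse, and the clr transformation. It assumes the last component is the alr reference and that all components are strictly positive; these choices and the function names are assumptions and may differ from the notation in the original figures.

```python
import math

def alr(x):
    """Additive log-ratio: log of each component over the last component."""
    reference = x[-1]
    return [math.log(v / reference) for v in x[:-1]]

def alr_inv(y):
    """Inverse alr: map D-1 real values back to D proportions summing to 1."""
    exps = [math.exp(v) for v in y] + [1.0]
    total = sum(exps)
    return [v / total for v in exps]

def clr(x):
    """Centered log-ratio: log of each component over the geometric mean."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical 3-part composition.
composition = [0.2, 0.3, 0.5]
round_trip = alr_inv(alr(composition))
centered = clr(composition)
```

The alr round trip recovers the original composition, and the clr values sum to zero, which is why clr-transformed coefficients can be ranked against a common center.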
According to the model's ranking rules, 1 bacterial taxon ranked first in the GDM group (Staphylococcus) and 4 bacterial taxa ranked highest in the non-GDM group (Clostridium, Roseburia, Coriobacteriaceae, Lachnoclostridium) were selected, as shown in (a) of Figure 6. Using the DR method and the foregoing calculation rules, 4 bacterial calculation indexes were obtained, with the relative abundance of Staphylococcus as the numerator and that of Clostridium, Roseburia, Coriobacteriaceae, or Lachnoclostridium as the denominator. All 4 indexes were significantly higher in the case group than in the control group (P < 0.001) and were correlated with the mid-pregnancy fasting blood glucose (FBG) level, 1-hour blood glucose level (1h-PG), and 2-hour blood glucose level (2h-PG), as shown in (b) of Figure 6. To construct a correlation network at the genus level, the correlations among the compositional data were calculated with the SparCC algorithm in the mothur software, yielding a correlation matrix and a P-value matrix. Significant positive or negative correlations between microorganisms were selected with the CoNet software. In this study, correlations with a Bonferroni-corrected P < 0.05 and an absolute correlation coefficient > 0.2 were retained, the degree of each node was then calculated, and the values were compared between the GDM case group and the control group. Network visualization and network attribute analysis were performed with Cytoscape 3.7.1.
After the correlation network was constructed, the network parameters obtained were analyzed and compared. The common parameters have the following meanings. Degree is the most direct index of node centrality in correlation network analysis, namely the number of edges incident to each node. The larger the degree of a node, the higher its centrality and the more likely it is to be a regulatory hub in the network. Eccentricity: a high eccentricity means that all other nodes are close to the node; the higher the eccentricity, the more meaningful the node. Radiality is obtained by calculating the shortest paths between a node and all other nodes; the higher the radiality, the more easily the node becomes a center regulated by other flora. Closeness is the reciprocal of the sum of the shortest distances from a node to all other nodes; the larger the closeness, the more central the node is in the network and the more other nodes tend to be close to it. Edge betweenness measures how far a node acts as an intermediary for other nodes in the network; the higher the edge betweenness, the more critical the resources occupied by the node. The network analysis of the members of the intestinal flora showed that the numbers of nodes and edges among the members of the early-pregnancy intestinal flora were lower in the GDM case group than in the control group, indicating that the intestinal microecological stability of pregnant women in the GDM case group was lower than that of the control group in early pregnancy, and that the connections among the members of the intestinal microbial correlation network were less tight than in the control group (Figure 7).
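Two of the network parameters above, degree and closeness, can be illustrated on a toy undirected graph. The sketch uses breadth-first search for shortest paths and the reciprocal-of-sum definition of closeness given in the text; the graph, the node names, and the function names are hypothetical, and Cytoscape may normalize these values differently.

```python
from collections import deque

def distances_from(graph, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def degree(graph, node):
    """Number of edges incident to the node."""
    return len(graph[node])

def closeness(graph, node):
    """Reciprocal of the sum of shortest distances to all other nodes."""
    total = sum(distances_from(graph, node).values())
    return 1.0 / total if total else 0.0

# Hypothetical 3-node path: A - B - C.
toy_network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
```

In this toy path the middle node B has degree 2 and closeness 0.5, while the end nodes have degree 1 and closeness 1/3.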
Example 4
We evaluated the association of individual intestinal bacteria with the risk of GDM using conditional logistic regression models, with covariates including maternal age, gestational week, parity, early-pregnancy BMI, smoking status (current or former smoking, never smoking), drinking status (current drinking or drinking within the past year, never drinking), physical activity, diet (intake of meat, vegetables, fruits, and eggs), family history of diabetes, history of GDM, and use of antibiotics. P values corrected for the false discovery rate (FDR) were estimated with the Benjamini-Hochberg procedure. A total of 6 intestinal bacterial taxa associated with a reduced risk of incident GDM were found: Actinobacteria, Coriobacteriaceae, Gemellaceae, genus No. 26 under Coriobacteriaceae, genus No. 74 under Gemellaceae, and 1 bacterial species under Coprococcus. For each 1% increase in the relative abundance of these 6 intestinal bacterial taxa, the multivariable-adjusted odds ratios (OR) for GDM were: Actinobacteria (OR, 0.94; 95% CI, 0.90, 0.98), Coriobacteriaceae (OR, 0.74; 95% CI, 0.63, 0.88), Gemellaceae (OR, 0.01; 95% CI, 0.001, 0.29), genus No. 26 under Coriobacteriaceae (OR, 0.37; 95% CI, 0.21, 0.64), genus No. 74 under Gemellaceae (OR, 0.01; 95% CI, 0.001, 0.28), and 1 bacterial species under Coprococcus (OR, 0.50; 95% CI, 0.30, 0.83).
We compared models predicting GDM based on the traditional GDM risk factors, on the 6 intestinal bacterial taxa, and on the traditional GDM risk factors plus the 6 intestinal bacterial taxa (Figure 8). The variables of the traditional GDM risk factor prediction model include maternal age, gestational week, parity, smoking status, drinking status, physical activity, fasting blood glucose, and family history of diabetes. Because none of the pregnant women in the non-GDM control group had a history of GDM, history of GDM was not included in the traditional risk factor prediction model. Adding the 6 intestinal bacterial taxa to the traditional model significantly improved the predictive performance of the regression model (P < 0.001): the C-statistic reached 0.75 (0.71, 0.80), compared with 0.69 (0.64, 0.74) for the traditional GDM prediction model.
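The Benjamini-Hochberg correction mentioned above can be sketched in a few lines. This is the standard step-up adjustment written in plain Python for illustration; the function name and the P values are hypothetical and are not results from this study.

```python
def benjamini_hochberg(pvalues):
    """Return FDR-adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical raw P values for 4 taxa.
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr_p = benjamini_hochberg(raw_p)
```

For these inputs the adjusted values are approximately 0.02, 0.04, 0.04 and 0.02, so all four would remain below a 0.05 threshold.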
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A biomarker of early gestational diabetes based on intestinal bacteria, comprising at least one of the following 27 types of bacteria; actinobacterium spp, TM7 spp, Cyanobacterium spp, Bidobacterium spp, Coriobacteriaceae spp, Streptococcus spp, TM7-3 spp, Erysipeliostrium spp, Erysipelichiaceae spp, Phascobacter spp, Streptococcus spp, Lactobacillus spp, Gemelalluceae spp, Adlercreutzia spp, Coriobacteriaceae spp, Bifidobacterium spp, Rothbacterium spp, Actinomyces spp, Proteobacteria spp, Veonobacterium spp, Microbacterium spp, and Escherichia sp;
wherein the relative abundance of Proteobacteria spp, velonella spp, Cardiobacterium spp, phascolatobacterium spp, Lachnospira spp, Streptophyta spp and Paraprevotella spp is increased in the gut flora of the group of patients with gestational diabetes during pregnancy; actinobacterium spp., TM7 spp., Cyanobacterium spp., Bifidobacterium spp., Coriobacteriaceae spp., Streptococcus spp., TM7-3 spp., Erysipelotridium spp., Erysipelrichaceae spp., Phascolatobacterium spp., Streptococcus spp., Lactobacillus spp., Gemedales spp., Adlercreutzia spp., Coriobacteriaceae spp., Bifidobacterium spp., Rothbacterium spp., Actinomyces spp., Coprococcus sp., and Anaerobacterium spp.
2. The early pregnancy based enterobacteria gestational diabetes biomarker of claim 1, wherein Actinobacterium spp.
3. A gestational diabetes biomarker based on differential ranking analysis of early-pregnancy intestinal bacteria, characterized by comprising at least one of the following 4 log-transformed ratios of the relative abundance of intestinal bacteria: log (relative abundance of Staphylococcus/relative abundance of Clostridium), log (relative abundance of Staphylococcus/relative abundance of Roseburia), log (relative abundance of Staphylococcus/relative abundance of Coriobacteriaceae), and log (relative abundance of Staphylococcus/relative abundance of Lachnoclostridium); the above 4 biomarkers are significantly elevated in the early-pregnancy intestinal flora of patients with gestational diabetes, at a significance level of P < 0.001; and the above 4 biomarkers are correlated with the fasting blood glucose level, 1-hour blood glucose level and 2-hour blood glucose level in mid-pregnancy.
4. Use of the gestational diabetes biomarker based on early-pregnancy intestinal bacteria according to claim 1 or 2 as a detection target in the preparation of a kit for predicting the risk of onset of gestational diabetes in early pregnancy.
5. Use of the gestational diabetes biomarker based on early-pregnancy intestinal bacteria according to claim 1 or 2 as a target in the preparation of a medicament for treating and/or preventing gestational diabetes;
preferably, the medicament is a probiotic, prebiotic or synbiotic.
6. Use of the gestational diabetes biomarker based on the differential ranking analysis of early-pregnancy intestinal bacteria in gestational diabetes according to claim 3 as a detection target in the preparation of a kit for predicting the risk of onset of gestational diabetes in early pregnancy.
7. Use of the gestational diabetes biomarker based on the differential ranking analysis of early-pregnancy intestinal bacteria in gestational diabetes according to claim 3 as a target in the preparation of a medicament for treating and/or preventing gestational diabetes;
preferably, the medicament is a probiotic, prebiotic or synbiotic.
8. A screening system for the gestational diabetes biomarker based on early-pregnancy intestinal bacteria according to claim 1 or 2, comprising:
a sample genome sequencing module: the sample genome sequencing module is used for performing genome sequencing on the DNA sample;
a data quality control module: the data quality control module is used for filtering the raw sequence data, merging paired-end sequences and removing chimeric sequences;
an amplicon sequence variant construction module: the amplicon sequence variant construction module is used for obtaining amplicon sequence variants, recording the number of sequences of each amplicon sequence variant in a sample, and obtaining the bacterial taxonomic name of each amplicon sequence variant by comparison against the Greengenes database;
a community structure analysis module: the community structure analysis module is used for determining the within-group diversity and the between-group diversity of the microorganisms;
a species composition analysis module: the species composition analysis module is used, on the one hand, for filtering and collapsing the raw amplicon sequence variant data and merging names at the same taxonomic level to obtain the relative abundance of the flora composition at each taxonomic level; and, on the other hand, for filtering the raw amplicon sequence variant data to obtain the relative abundance of the flora composition at the amplicon sequence variant level;
a differential flora analysis module: the differential flora analysis module is used, on the one hand, for standardizing the relative abundance of the flora composition at each taxonomic level and then performing the Kruskal-Wallis rank sum test, the Wilcoxon rank sum test and linear discriminant analysis to obtain the bacterial taxa that differ significantly between groups, a significant difference being a linear discriminant analysis score greater than 2.0; and, on the other hand, for screening amplicon sequence variants that differ between groups based on the W statistic using the analysis of composition of microbiomes (ANCOM) method.
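As a rough illustration of the species composition analysis step described in claim 8, the following Python sketch collapses amplicon sequence variant counts to one taxonomic level and converts them to relative abundances. The function names, the lineage layout and the read counts are hypothetical and are not taken from the application; a real pipeline would more likely rely on a dedicated microbiome toolkit.

```python
from collections import defaultdict


def collapse_to_level(asv_counts, asv_taxonomy, level):
    """Sum ASV read counts that share the same name at one taxonomic level.

    asv_counts:   {asv_id: read count in one sample}
    asv_taxonomy: {asv_id: [phylum, genus, ...]} (hypothetical lineage layout)
    level:        index into the lineage list
    """
    collapsed = defaultdict(int)
    for asv_id, count in asv_counts.items():
        lineage = asv_taxonomy[asv_id]
        # Lineages that are not resolved down to this level get a placeholder.
        name = lineage[level] if level < len(lineage) else "Unassigned"
        collapsed[name] += count
    return dict(collapsed)


def relative_abundance(counts):
    """Convert raw counts to fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: count / total for name, count in counts.items()}


# Made-up counts for a single sample, for illustration only.
counts = {"ASV1": 30, "ASV2": 20, "ASV3": 50}
taxonomy = {
    "ASV1": ["Firmicutes", "Streptococcus"],
    "ASV2": ["Firmicutes", "Streptococcus"],
    "ASV3": ["Actinobacteria", "Bifidobacterium"],
}
genus_counts = collapse_to_level(counts, taxonomy, level=1)
print(relative_abundance(genus_counts))
```

Merging names before normalizing keeps the relative abundances at each level summing to 1 for every sample, which is what the downstream rank sum tests would compare between groups.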
9. The screening system of claim 8, further comprising:
a logistic regression model evaluation module: the logistic regression model evaluation module is used for evaluating, through a conditional logistic regression model, the association between individual intestinal bacterial taxa and the risk of onset of gestational diabetes, and for screening the intestinal bacterial taxa associated with the risk of new-onset gestational diabetes;
a gestational diabetes risk factor prediction module: the gestational diabetes risk factor prediction module is used for evaluating the change in the predictive performance of a basic gestational diabetes prediction model after the intestinal bacterial factors associated with the risk of new-onset gestational diabetes are added to it.
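The claim does not name the metric used to measure the change in predictive performance. The sketch below assumes, purely for illustration, that the area under the ROC curve (AUC) is compared between a basic model and a model with bacterial factors added. The predicted risks are made-up numbers, and the function name is hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5).

    scores: predicted risks, one per subject
    labels: 1 for a gestational diabetes case, 0 for a control
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up predicted risks for two cases followed by two controls.
labels = [1, 1, 0, 0]
basic_model_risk = [0.6, 0.4, 0.5, 0.3]        # assumed basic risk factors only
extended_model_risk = [0.8, 0.6, 0.5, 0.3]     # assumed: bacterial factors added

auc_basic = auc(basic_model_risk, labels)
auc_extended = auc(extended_model_risk, labels)
print(auc_basic, auc_extended, auc_extended - auc_basic)
```

A positive difference would indicate that the added bacterial factors improve discrimination in this toy data; other measures of predictive performance could be substituted in the same comparison.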
10. A screening system for the gestational diabetes biomarker based on the differential ranking analysis of early-pregnancy intestinal bacteria in gestational diabetes according to claim 3, comprising:
a sample genome sequencing module: the sample genome sequencing module is used for performing genome sequencing on the DNA sample;
a data quality control module: the data quality control module is used for filtering the raw sequence data, merging paired-end sequences and removing chimeric sequences;
an amplicon sequence variant construction module: the amplicon sequence variant construction module is used for obtaining amplicon sequence variants, recording the number of sequences of each amplicon sequence variant in a sample, and obtaining the bacterial taxonomic name of each amplicon sequence variant by comparison against the Greengenes database;
a species composition analysis module: the species composition analysis module is used for collapsing the amplicon sequence variant data to the bacterial species level and merging names at the same species level to obtain the relative abundance of the flora composition at the bacterial species level;
a differential ranking analysis module: the differential ranking analysis module is used for ranking the bacterial taxa by their between-group differences after eliminating the influence of absolute bacterial abundance, and for calculating the log-transformed relative abundance ratios of the 4 groups of intestinal bacteria, namely log (the relative abundance of Staphylococcus/the relative abundance of Clostridium), log (the relative abundance of Staphylococcus/the relative abundance of Roseburia), log (the relative abundance of Staphylococcus/the relative abundance of corybacterium) and log (the relative abundance of Staphylococcus/the relative abundance of Lactobacillus), respectively.
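The log-transformed relative abundance ratios recited in the claims can be sketched in Python as follows. This is a minimal example under stated assumptions: the claims specify neither the logarithm base nor how zero abundances are handled, so the natural logarithm and a small pseudocount are assumed here, and the abundances are made-up values.

```python
import math


def log_ratio(rel_abund, numerator, denominator, pseudocount=1e-6):
    """log(relative abundance of numerator / relative abundance of denominator).

    rel_abund:   {taxon name: relative abundance in one sample}
    pseudocount: assumed small constant that avoids division by zero or log(0)
                 when a taxon is absent; not specified in the claims
    """
    num = rel_abund.get(numerator, 0.0) + pseudocount
    den = rel_abund.get(denominator, 0.0) + pseudocount
    return math.log(num / den)


# Made-up relative abundances for one sample, for illustration only.
sample = {"Staphylococcus": 0.02, "Clostridium": 0.01, "Roseburia": 0.04}

for denominator in ("Clostridium", "Roseburia"):
    value = log_ratio(sample, "Staphylococcus", denominator)
    print(f"log(Staphylococcus/{denominator}) = {value:.3f}")
```

Because a ratio of two relative abundances within one sample does not depend on the sample total, such a marker is unaffected by differences in overall bacterial load, which is consistent with the stated aim of eliminating the influence of absolute abundance.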
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110474713.1A CN113174444A (en) 2021-04-29 2021-04-29 Gestational diabetes biomarker of intestinal bacteria in early pregnancy and screening and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110474713.1A CN113174444A (en) 2021-04-29 2021-04-29 Gestational diabetes biomarker of intestinal bacteria in early pregnancy and screening and application thereof

Publications (1)

Publication Number Publication Date
CN113174444A true CN113174444A (en) 2021-07-27

Family

ID=76925488

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110474713.1A Pending CN113174444A (en) 2021-04-29 2021-04-29 Gestational diabetes biomarker of intestinal bacteria in early pregnancy and screening and application thereof

Country Status (1)

Country Link
CN (1) CN113174444A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102743420A (en) * 2012-06-06 2012-10-24 上海交通大学 Method for improving intestinal colony structure and application
CN109897906A (en) * 2019-03-04 2019-06-18 福建西陇生物技术有限公司 A kind of detection method and its application of intestinal flora 16S rRNA gene
CN110408699A (en) * 2019-07-11 2019-11-05 福建卫生职业技术学院 Intestinal cancer intestinal flora marker and its application

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HASAIN Z, MOKHTAR NM, KAMARUDDIN NA et al.: "Gut Microbiota and Gestational Diabetes Mellitus: A Review of Host-Gut Microbiota Interactions and Their Therapeutic Potential", 《FRONT CELL INFECT MICROBIOL》, 15 May 2020 (2020-05-15), pages 1 - 19 *
PING HU, XIUYI CHEN, XUFENG CHU et al.: "Association Of Gut Microbiota During Early Pregnancy With Risk Of Incident Gestational Diabetes Mellitus: A Nested Case-Control Study", 《RESEARCH SQUARE》, 29 July 2020 (2020-07-29), pages 1 - 24 *
TANG N, LUO ZC, ZHANG L et al.: "The Association Between Gestational Diabetes and Microbiota in Placenta and Cord Blood", 《FRONT ENDOCRINOL (LAUSANNE)》, 21 October 2020 (2020-10-21), pages 1 - 11 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114034781A (en) * 2021-09-16 2022-02-11 中日友好医院 Biomarker, method and early warning model for predicting gestational diabetes in early pregnancy
CN114038503A (en) * 2021-10-26 2022-02-11 艾德范思(北京)医学检验实验室有限公司 Human breast milk sample characteristic bacteria data analysis and identification method based on high-throughput sequencing
CN114317671A (en) * 2021-12-27 2022-04-12 复旦大学附属儿科医院 Intestinal bacteria and fecal metabolites capable of being used as biomarkers of type 1 diabetes and application thereof
CN114317671B (en) * 2021-12-27 2024-04-16 复旦大学附属儿科医院 Intestinal bacteria and fecal metabolites useful as biomarkers for type 1 diabetes and uses thereof
CN114540483A (en) * 2022-02-24 2022-05-27 山东体育学院 Intestinal flora-based hydroxytyrosol biomarker for improving oxidative damage caused by aerobic exercise and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210727