CN107858416B - Biomarker combination for detecting endometriosis and application thereof - Google Patents

Info

Publication number
CN107858416B
CN107858416B CN201610831390.6A CN201610831390A CN107858416B CN 107858416 B CN107858416 B CN 107858416B CN 201610831390 A CN201610831390 A CN 201610831390A CN 107858416 B CN107858416 B CN 107858416B
Authority
CN
China
Prior art keywords
endometriosis
seq
marker
biomarker
nucleic acids
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610831390.6A
Other languages
Chinese (zh)
Other versions
CN107858416A (en)
Inventor
贾慧珏
钟焕姿
宋晓蕾
王子榕
陈晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Priority to CN201610831390.6A (patent CN107858416B)
Priority to CN201780047955.4A (patent CN109715828B)
Priority to PCT/CN2017/096249 (patent WO2018049947A1)
Publication of CN107858416A
Application granted
Publication of CN107858416B
Legal status: Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/136: Screening for pharmacological compounds

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Plant Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The application discloses a biomarker combination for detecting endometriosis or assessing the risk of endometriosis, and applications thereof. The biomarker combination comprises at least one of twenty-four nucleic acids, which are the sequences shown in Seq ID No.1 to Seq ID No.24, respectively, or sequences having more than 97% similarity to the sequences shown in Seq ID No.1 to Seq ID No.24, respectively. The biomarker combination has good specificity, can be used for early diagnosis of endometriosis, and provides a new approach to endometriosis detection and risk assessment. It also offers high sensitivity, high specificity, and good repeatability, and has important application value. Using genital tract samples as the test material offers convenient sampling, simple operation, and continuous in vitro testing.

Description

Biomarker combination for detecting endometriosis and application thereof
Technical Field
The application relates to the field of biomarkers, in particular to a biomarker combination for detecting endometriosis or evaluating the risk of endometriosis and application thereof.
Background
Endometriosis is a common disease in women of childbearing age and causes a variety of clinical symptoms. Its incidence has risen markedly year by year and now reaches 10 to 15 percent, and it accounts for more than 30 percent of gynecological operations. Dysmenorrhea, lower abdominal pain, and infertility caused by endometriosis seriously affect women's health and quality of life. Although endometriosis is a benign disease, its lesions are widespread and varied in form, it is highly invasive and prone to recurrence, and the damage it does to organ structure and function is comparable to that of a malignant tumor, so it is called a "benign cancer". The clinical manifestations of endometriosis are not proportional to the extent of the lesions, so effective clinical diagnostic methods are lacking.
Although some clinical markers currently used for endometriosis, such as cancer antigen 125 (CA125), placental protein 14 (PP14), anti-endometrial antibody, and interleukin 6 (IL-6), have some reference value for clinical diagnosis and for tracking recurrence, they still lack specificity.
Therefore, the search for sensitive and specific biomarkers of endometriosis is an urgent problem to be solved at present.
Disclosure of Invention
The application aims to provide a biomarker for detecting endometriosis and a preparation method and application thereof.
In order to achieve the purpose, the following technical scheme is adopted in the application:
one aspect of the present application discloses a biomarker combination for detecting endometriosis, the biomarker combination comprising at least one of twenty-four nucleic acids, the twenty-four nucleic acids being the sequences shown in Seq ID No.1 to Seq ID No.24, respectively, or sequences having a similarity of 97% or more to the sequences shown in Seq ID No.1 to Seq ID No.24, respectively.
It should be noted that the twenty-four nucleic acids of the present application are nucleic acid sequences found through research to be associated with endometriosis. Because each sequence is individually associated with endometriosis, the sequences can be used alone or in combination for endometriosis detection or risk assessment when accuracy is not a concern or the accuracy requirement is low. In the preferred embodiment of the present application, however, the twenty-four nucleic acids are not only used together but are also divided according to a specific rule into several marker sets, which are used together for endometriosis detection or risk assessment; this is described in detail in the preferred embodiment below.
It should be further noted that the twenty-four nucleic acids of the present application were obtained by clustering sequences at a similarity of more than 97% and then selecting the most representative sequence of each operational taxonomic unit (abbreviated as OTU) as its seed sequence; the twenty-four seed sequences associated with endometriosis constitute the biomarker combination of the present application. Therefore, in the biomarker combination of the present application, the twenty-four nucleic acids are not limited to the sequences shown in Seq ID No.1 to Seq ID No.24, but may also be sequences having a similarity of 97% or more to the sequences shown in Seq ID No.1 to Seq ID No.24.
It should be added that the biomarker combination of the present application is not used to detect endometriosis or assess the risk of endometriosis directly from the presence or absence of the biomarkers. Instead, after the biomarker combination is detected, the relative abundance of each biomarker is analyzed and entered into a random forest model, and whether the subject has endometriosis, or the subject's risk of endometriosis, is evaluated from the probability output by the random forest model; this is described in detail in the technical solutions below.
Preferably, another aspect of the present application discloses a biomarker combination for endometriosis detection or risk assessment, comprising at least one of a first marker panel, a second marker panel and a third marker panel; the first marker group is composed of fourteen nucleic acids which are sequences shown as Seq ID No.1, Seq ID No.2, Seq ID No.6, Seq ID No.7, Seq ID No.12, Seq ID No.13, Seq ID No.15, Seq ID No.17 to Seq ID No.22, and Seq ID No.24, respectively, or sequences having a similarity of 97% or more to the sequences shown as Seq ID No.1, Seq ID No.2, Seq ID No.6, Seq ID No.7, Seq ID No.12, Seq ID No.13, Seq ID No.15, Seq ID No.17 to Seq ID No.22, and Seq ID No.24, respectively; the second marker group consists of two nucleic acids, wherein the two nucleic acids are sequences shown in Seq ID No.1 and Seq ID No.7 respectively, or sequences with more than 97% similarity to the sequences shown in Seq ID No.1 and Seq ID No.7 respectively; the third marker set is composed of eleven nucleic acids which are sequences shown by Seq ID No.3 to Seq ID No.5, Seq ID No.8 to Seq ID No.12, Seq ID No.14, Seq ID No.16, and Seq ID No.23, respectively, or sequences having a similarity of 97% or more to the sequences shown by Seq ID No.3 to Seq ID No.5, Seq ID No.8 to Seq ID No.12, Seq ID No.14, Seq ID No.16, and Seq ID No.23, respectively.
It should be noted that, in the preferred embodiment of the present application, the twenty-four nucleic acids are divided into three marker sets, i.e. a first marker set, a second marker set and a third marker set, and a nucleic acid may be selected into more than one set; judging comprehensively from the three marker sets can greatly improve the accuracy with which the biomarker combination detects endometriosis or assesses the risk of endometriosis.
Preferably, the first marker panel is a CL marker panel for use in endometriosis detection or risk assessment in a sample from the lower third of the vagina.
Preferably, the second marker set is a CU marker set for use in endometriosis detection or risk assessment of a sample from the posterior fornix of the vagina.
Preferably, the third marker panel is a CV marker panel for use in detecting endometriosis or assessing risk of acquiring endometriosis in a sample from a cervical canal.
It is noted that the twenty-four nucleic acids in the biomarker combination of the present application actually represent 14 microorganisms at three sites: the lower third of the vagina, the posterior fornix of the vagina, and the cervical canal. The twenty-four nucleic acids of the 14 microorganisms at these three sites are detected, the relationship between their relative abundance and endometriosis is analyzed statistically, and a random forest model is built to judge whether a subject has endometriosis or is at risk of endometriosis. The three marker sets therefore correspond to the three sampling sites, and the samples from the three sites are analyzed and judged independently, each with its own marker set. However, judging comprehensively from the results of all three improves the accuracy with which the biomarker combination detects endometriosis or assesses the risk of endometriosis.
It should be noted that far more than 14 microorganisms are present at the three sites (the lower third of the vagina, the posterior fornix of the vagina, and the cervical canal), and the nucleic acids of the 14 microorganisms number far more than the 24 described in the present application; however, the present application uses the random forest model to screen out twenty-four nucleic acids of 14 microorganisms as biomarkers for detecting endometriosis, providing a new approach for the detection and assessment of endometriosis.
It should be noted that, of the three marker sets, the CL marker set is the marker set for samples from the lower third of the vagina, which is abbreviated CL; the CU marker set is the marker set for samples from the posterior fornix of the vagina, abbreviated CU; and the CV marker set is the marker set for cervical canal samples, abbreviated CV.
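The composition of the three marker sets described above can be written down and checked mechanically. The following is only an illustrative sketch: the dictionary layout and the helper name `seq_range` are assumptions made for the example, while the Seq ID numbers are those listed in the text.

```python
# Marker sets as listed in the text; the data layout and helper name
# are illustrative assumptions, not part of the application.
def seq_range(first, last):
    """Inclusive range of Seq ID numbers, e.g. 'Seq ID No.17 to No.22'."""
    return set(range(first, last + 1))


MARKER_PANELS = {
    # CL: lower third of the vagina (fourteen nucleic acids)
    "CL": {1, 2, 6, 7, 12, 13, 15, 24} | seq_range(17, 22),
    # CU: posterior fornix of the vagina (two nucleic acids)
    "CU": {1, 7},
    # CV: cervical canal (eleven nucleic acids)
    "CV": seq_range(3, 5) | seq_range(8, 12) | {14, 16, 23},
}

panel_sizes = {site: len(ids) for site, ids in MARKER_PANELS.items()}
all_markers = set().union(*MARKER_PANELS.values())
# Seq IDs that appear in more than one marker set
shared = sorted(
    seq_id
    for seq_id in all_markers
    if sum(seq_id in ids for ids in MARKER_PANELS.values()) > 1
)

print(panel_sizes)       # sizes of the CL, CU and CV sets
print(len(all_markers))  # number of distinct sequences overall
print(shared)            # sequences used at more than one site
```

The three sets contain 14, 2 and 11 sequences; because Seq ID No.1, No.7 and No.12 each appear in two sets, the 27 memberships cover exactly the 24 distinct sequences.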
Another aspect of the present application discloses a kit for detecting endometriosis or assessing the risk of endometriosis, which comprises a primer pair for detecting the biomarker combination of the present application, wherein the forward primer of the primer pair is the sequence shown in Seq ID No.25 and the reverse primer is the sequence shown in Seq ID No.26.
It should be noted that the biomarker combination of the present application can be present in the kit as a standard reference, and the primer pair is directly used for PCR amplification of the biomarker combination in the sample to be tested.
The application also discloses the application of the biomarker combination in endometriosis drug screening or in preparation of a kit or a detection tool for endometriosis detection or disease risk assessment.
It will be appreciated that the biomarker combination of the present application was itself identified through research on endometriosis and can therefore be used for endometriosis detection or risk assessment. It can also be incorporated into kits or tools designed for endometriosis detection in order to facilitate detection and assessment, and any such use of the biomarker combination falls within the scope of the present application. Moreover, because the biomarker combination can detect endometriosis or assess the risk of endometriosis, it can be used to compare the disease state or disease risk before and after administration of a drug, so as to judge whether the drug is effective and thereby achieve the purpose of drug screening.
The application further discloses an application of the method for judging endometriosis by detecting the biomarkers in preparing a kit or a tool for detecting endometriosis or evaluating the risk of endometriosis; wherein the biomarker is a biomarker combination of the present application;
the method for judging endometriosis by detecting biomarkers comprises the following steps,
(1) performing sample collection on an object to be detected, detecting the biomarker combinations in the collected samples, and analyzing the levels of all nucleic acids in the biomarker combinations;
(2) comparing the level of each nucleic acid measured in step (1) with a reference data set or reference value to obtain a test result;
preferably, the level of each nucleic acid is the relative abundance of each nucleic acid; the reference data set or reference value is the level of each nucleic acid in the biomarker combination derived from an endometriosis patient and a non-endometriosis control.
More preferably, the reference data set or reference value in step (2) is at least one of table 5, table 6 or table 7; comparing the level of each nucleic acid with a reference data set or a reference value to obtain a detection result, specifically including calculating a prevalence probability using a multivariate statistical model, preferably, the multivariate statistical model is a random forest model.
In a further aspect of the present application, there is disclosed a method of screening a candidate drug for the treatment of endometriosis comprising the steps of,
1) determining the biomarker combinations of the present application in the pre-and post-dose samples, respectively, and analyzing the levels of each nucleic acid in the biomarker combinations;
2) determining candidate drugs based on comparing the levels of each nucleic acid in the pre-and post-dose samples;
in step 2), comparing the levels of the nucleic acids in the sample before and after administration, specifically comprising calculating the prevalence probability by using a multivariate statistical model, preferably, the multivariate statistical model is a random forest model.
Due to the adoption of the technical scheme, the beneficial effects of the application are as follows:
the biomarker for detecting the endometriosis has good specificity, can well detect the endometriosis, provides a new way for detecting or evaluating the risk of the endometriosis, and can be used for early diagnosis of the endometriosis.
Other major advantages of the present application include:
(a) the biomarker combination is used for detecting endometriosis or evaluating the risk of endometriosis, has the advantages of high sensitivity and high specificity, and has important application value.
(b) The genital tract sample as the biomarker combined detection sample has the advantages of convenient material taking, simple operation steps, continuous in vitro detection and the like.
(c) The biomarker combination has the characteristic of good repeatability when used for detecting endometriosis or evaluating the disease risk.
Drawings
Fig. 1 is a graph of the results of identifying endometriosis (infertility) based on the CL marker set for the lower third of the vagina in the examples of the present application, where a is the distribution of the error rate over five repetitions of 10-fold cross-validation of the random forest identification of endometriosis (infertility) as the number of OTUs increases, b is the receiver operating characteristic curve (abbreviated ROC curve) of the cross-validated combination, the area under the curve (abbreviated AUC) is 0.8272, the shaded area represents the 95% confidence interval, and the diagonal represents the curve with an AUC of 0.5;
Fig. 2 is a graph of the results of identifying endometriosis (infertility) based on the CU marker set for the posterior fornix of the vagina in the examples of the present application, where a is the distribution of the error rate over five repetitions of 10-fold cross-validation of the random forest identification of endometriosis (infertility) as the number of OTUs increases, b is the ROC curve of the cross-validated combination, the AUC is 0.5919, the shaded area represents the 95% confidence interval, and the diagonal represents the curve with an AUC of 0.5;
Fig. 3 is a graph of the results of identifying endometriosis (infertility) based on the CV marker set for the cervical canal in the examples of the present application, where a is the distribution of the error rate over five repetitions of 10-fold cross-validation of the random forest identification of endometriosis (infertility) as the number of OTUs increases, b is the ROC curve of the cross-validated combination, the AUC is 0.8493, the shaded area represents the 95% confidence interval, and the diagonal represents the curve with an AUC of 0.5;
FIG. 4 is a ROC curve for the CL marker set at 1/3 below the vagina in the example of the present application to identify endometriosis (infertility) in the second population;
FIG. 5 is a ROC curve for the identification of endometriosis (infertility) in the second population for the posterior vaginal fornix CU marker set in the present example;
FIG. 6 is a ROC curve for a cervical CV marker panel identifying endometriosis (infertility) in a second population in an embodiment of the present application;
in the figures, the number of variables refers to the number of OTUs; sensitivity = true positives / (true positives + false negatives); specificity = true negatives / (true negatives + false positives).
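The two definitions used in the figures can be illustrated with a short calculation. This is a minimal sketch; the confusion-matrix counts below are hypothetical and are not taken from the application.

```python
def sensitivity(true_pos, false_neg):
    """Sensitivity = true positives / (true positives + false negatives)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Specificity = true negatives / (true negatives + false positives)."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only: 11 patients and 11 controls.
tp, fn = 9, 2   # patients classified correctly / missed
tn, fp = 8, 3   # controls classified correctly / flagged by mistake

print(f"sensitivity = {sensitivity(tp, fn):.3f}")  # 9 / 11
print(f"specificity = {specificity(tn, fp):.3f}")  # 8 / 11
```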
Detailed Description
The biomarkers of the present application are derived from the relationship between the DNA of the microbial populations at the three sampled sites and endometriosis; they are in effect microbial OTUs that can represent the endometriosis status at these three sites. Specifically, in one preparation method of the present application, the relative abundance of the OTU seed sequences is taken as one variable and the endometriosis status (diseased or non-diseased) as the other, the two are fitted with a random forest model, and 10-fold cross-validation is repeated 5 times to obtain the correspondence, i.e. the biomarkers. Through rigorous calculation and experimental research, twenty-four nucleic acids of 14 microorganisms at the three sites were finally obtained as the biomarkers of the present application.
In one implementation manner of the application, the marker sets of the three parts can independently evaluate the endometriosis or the risk of endometriosis, but the accuracy is higher by combining the probabilities of the three parts to judge whether the object to be detected has endometriosis or has the risk of endometriosis.
The terms used herein are intended to have the meanings commonly understood by those of ordinary skill in the art. For a better understanding of the present application, some definitions and related terms are explained as follows:
the term "endometriosis" as used herein refers to a common gynecological disease, defined as a condition in which endometrial tissue grows outside the uterine cavity. Endometriosis is most commonly found in the ovaries and fallopian tubes, and may also occur in the myometrium, the pelvic peritoneum, and even the bladder and large intestine. Ectopic endometrium in the ovary can form a cyst filled with brown fluid, known as a chocolate cyst or chocolate tumor, which can affect pregnancy.
The level of biomarker substance of the present application is indicated by relative abundance.
In one embodiment of the present application, the reference value refers to a reference value or normal value of a healthy control. It is clear to the person skilled in the art that the range of normal values, i.e. absolute values, for each biomarker can be obtained by testing and calculation methods in case of a sufficient number of samples.
A "biomarker," also referred to as a "biological marker" in the present application, refers to a measurable indicator of a biological state of an individual. Such biomarkers may be any substance in the individual as long as they are associated with a particular biological state of the subject being examined, such as a disease. Such biomarkers can be, for example, nucleic acid markers (e.g., DNA), protein markers, cytokine markers, chemokine markers, carbohydrate markers, antigen markers, antibody markers, species markers (species/genus markers), and functional markers (KO/OG markers), among others. The biomarkers of the present application are specifically DNA nucleic acid markers.
The "OTU" in the present application refers to an operational taxonomic unit (OTU), a uniform label artificially assigned to a taxonomic unit (such as a strain, species, genus, or group) to facilitate analysis in phylogenetic or population genetics research. In the present application, sequences are grouped into OTUs at a similarity threshold of 97%, so that a number of OTUs are obtained from the samples of each of the three sites, and each OTU is regarded as one microbial species. Analyses of both the microbial diversity in a sample and the abundance of the different microorganisms are based on OTUs.
Reference to "individual" in this application refers to an animal, particularly a mammal, such as a primate, which in the examples of this application is a human.
The present application is described in further detail below with reference to specific embodiments and the attached drawings. The following examples are intended to be illustrative of the present application only and should not be construed as limiting the present application.
Examples
1. Materials and methods
1.1 sample Collection
Sample collection for this example was carried out with the assistance of obstetricians at Shenzhen North Hospital. Cases with inflammation were excluded. The study subjects were women who were not menstruating, pregnant, or lactating, had no endocrine or autoimmune disease, and had normal liver and kidney function. They had used no hormones or antibiotics for a period of time before sampling, had undergone no vaginal medication, vaginal lavage, or cervical treatment, and had had no sexual intercourse within 48 hours before sampling. In this example, 49 women of childbearing age were selected according to these criteria as the first population. Detailed phenotypic information, including medical history, family history, medication history, and lifestyle habits, was recorded for every individual meeting the criteria, and all individuals signed informed consent.
After admission and after emptying the bladder, each individual was sampled on a gynecological examination bed, without disinfection, at three sites of the lower genital tract: the lower third of the vagina (abbreviated as CL), the posterior fornix of the vagina (abbreviated as CU), and the cervical canal (abbreviated as CV). The sample numbers and sampling information of the 49 subjects are as follows. The seventeen subjects numbered C026, C028, C033, C035, C038, C041, C045, C048, C050, C055, C056, C058, C059, C062, C063, C064 and C065 are non-endometriosis individuals, and samples from all three sites (CL, CU and CV) were collected from each of them. The thirty-two subjects numbered T022, T024, T027, T028, T032, T033, T036, T039, T041, T042, T045, T053, T056, T058, T059, T061, T062, T063, T067, T069, T070, T076, T078, T084, T085, T086, T087, T088, T090, T092, T094, and T095 are endometriosis patients, and samples from all three sites (CL, CU, and CV) were collected from each of them.
Samples were collected using nylon flocked swabs (CY-93050 and CY-98000) purchased from Chenyang Global Group. After sampling, the swab heads were snap-frozen in liquid nitrogen, stored at -80 ℃, and transported on dry ice to BGI-Shenzhen (Shenzhen Huada Gene Research Institute) for subsequent experiments.
1.2 DNA extraction and 16S sequencing
In this example, DNA extraction was carried out using QIAamp DNA Mini Kit (purchased from QIAGEN). The specific extraction step is carried out according to the instruction provided by the manufacturer. 16S rRNA gene V4-V5 hypervariable region specific primers are used for amplification, the two primers are V4-515F and V5-907R respectively, V4-515F is a sequence shown in Seq ID No.25, and V5-907R is a sequence shown in Seq ID No. 26.
Seq ID No.25:5’-GTGCCAGCMGCCGCGGTAA-3’
Seq ID No.26:5’-CCGTCAATTCMTTTRAGT-3’
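Both primers contain degenerate positions written in standard IUPAC nucleotide codes (M = A or C, R = A or G). As an illustrative sketch, the concrete sequences each degenerate primer stands for can be enumerated as follows; the function name `expand_primer` is an assumption made for the example.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


V4_515F = "GTGCCAGCMGCCGCGGTAA"  # Seq ID No.25
V5_907R = "CCGTCAATTCMTTTRAGT"   # Seq ID No.26

for name, primer in (("V4-515F", V4_515F), ("V5-907R", V5_907R)):
    variants = expand_primer(primer)
    print(name, len(primer), "nt,", len(variants), "variants")
    for seq in variants:
        print("  ", seq)
```

V4-515F has one degenerate position and expands to two sequences; V5-907R has two degenerate positions and expands to four.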
The PCR program was: initial denaturation at 94 ℃ for 3 min; then 25 cycles of denaturation at 94 ℃ for 45 s, annealing at 50 ℃ for 60 s, and extension at 72 ℃ for 90 s; followed by a final extension at 72 ℃ for 10 min. The PCR products were purified with AMPure Beads (Axygen). Sequencing was performed with multiple samples pooled on one chip, so during library construction a 10 bp barcode sequence was attached to the outer end of the primer sequence of each sample, followed by an adapter sequence. A different barcode sequence, i.e. sample identification sequence, was added to each sample so that the samples could be distinguished. After library construction, reverse sequencing from V5 to V4 was performed on the Ion Torrent PGM sequencing platform. Library construction and sequencing were carried out by BGI-Shenzhen (Shenzhen Huada Gene).
1.3 16S sequencing data processing
Raw data were extracted from the PGM system and preprocessed using the Mothur software (V1.33.3). The criteria for high-quality sequences were: 1) length greater than 200 bp; 2) few mismatched bases relative to the degenerate PCR primers; 3) average quality score greater than 25. Based on the 16S rRNA gene sequences, OTUs were clustered using the uclust method of QIIME with the similarity threshold set to 97%. The seed sequence of each OTU was selected and annotated against the reference set gg_13_8_otus of the Greengenes database. The relative abundance of each OTU in each sample was then calculated, the relative abundance of an OTU being the ratio of the abundance of that OTU in a sample to the sum of the abundances of all OTUs in the sample.
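The relative abundance defined above is a simple normalization of per-sample OTU counts. The following sketch illustrates it; the OTU names and read counts are hypothetical.

```python
def relative_abundance(otu_counts):
    """Divide each OTU's abundance by the sum over all OTUs in the sample."""
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {otu: count / total for otu, count in otu_counts.items()}


# Hypothetical read counts for one sample (illustration only).
sample_counts = {"OTU_1": 600, "OTU_2": 300, "OTU_3": 100}
sample_rel = relative_abundance(sample_counts)

for otu, value in sample_rel.items():
    print(otu, value)
```

By construction the relative abundances of one sample sum to 1, which makes samples with different sequencing depths comparable.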
1.4 microbial population consistency analysis between samples at different sites
Based on the presence or absence of OTUs, this example uses the Sørensen index (Sørensen-Dice index) to measure the similarity of the microbiota of samples from different sites of the same individual, calculated as follows:
$$QS = \frac{2C}{A + B}$$
where A and B represent the number of OTUs in samples A and B, respectively, and C represents the number of OTUs shared by both samples. QS is the similarity index and ranges from 0 to 1. In this example, the similarity index of CL and CU, the similarity index of CL and CV, and the similarity index of CU and CV are calculated, respectively. The closer the similarity index is to 1, the more similar the microbiota of the two sampling sites.
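As a minimal sketch of this presence/absence comparison, the index can be computed from two sets of detected OTUs. The OTU names below are hypothetical.

```python
def sorensen_index(otus_a, otus_b):
    """QS = 2C / (A + B), with A, B the OTU counts and C the shared OTUs."""
    a, b = set(otus_a), set(otus_b)
    if not a and not b:
        raise ValueError("both samples are empty")
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical OTUs detected at two sites of one individual.
cl_otus = {"OTU_1", "OTU_2", "OTU_3", "OTU_4"}
cv_otus = {"OTU_3", "OTU_4", "OTU_5"}

# A = 4, B = 3, C = 2, so QS = 4 / 7
print(round(sorensen_index(cl_otus, cv_otus), 3))
```

Identical OTU sets give QS = 1 and disjoint sets give QS = 0, matching the stated range.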
1.5 random forest classifier
To build a model capable of identifying samples in the abnormal state, for each sampling site the randomForest package in R (3.1.2 RC) is used, with default parameters, to fit the relative abundance of the OTUs of each sample against the endometriosis status. Only OTUs present in at least 10% of the samples are used; that is, for each site, OTUs detected in fewer than 10% of all samples are excluded. Ten-fold cross-validation is then repeated 5 times, the error curves of the 5 repetitions are averaged, and the lowest error on the averaged curve plus the standard error at that point is taken as the threshold of acceptable error. Among the OTU sets whose classification error is below the threshold, the one with the fewest OTUs is the optimal OTU combination and is used as the biomarker combination for identifying endometriosis.
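Two of the steps above, the 10% prevalence filter and the error-threshold rule for choosing the number of OTUs, can be sketched without fitting a model. The sketch below is in Python rather than R and does not reproduce the randomForest fit; the data layout, the function names, the error values, and the way the standard error is computed (standard deviation across repeats divided by the square root of the number of repeats) are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def filter_by_prevalence(abundance, min_fraction=0.10):
    """Keep OTUs detected (abundance > 0) in at least min_fraction of samples.

    abundance: {sample_id: {otu_id: relative_abundance}} (hypothetical layout).
    """
    n_samples = len(abundance)
    detected = {}
    for profile in abundance.values():
        for otu, value in profile.items():
            if value > 0:
                detected[otu] = detected.get(otu, 0) + 1
    return {otu for otu, n in detected.items() if n / n_samples >= min_fraction}


def select_otu_number(error_curves):
    """Pick the smallest OTU count whose mean error is within the threshold.

    error_curves: {n_otus: [error of repeat 1, ..., error of repeat k]}.
    Threshold = lowest mean error + standard error at that point, where the
    standard error is assumed to be sd / sqrt(k) across the k repeats.
    '<=' is used so the rule still returns a value when that error is zero.
    """
    mean_error = {n: mean(errors) for n, errors in error_curves.items()}
    best_n = min(mean_error, key=mean_error.get)
    repeats = error_curves[best_n]
    threshold = mean_error[best_n] + stdev(repeats) / sqrt(len(repeats))
    chosen = min(n for n, err in mean_error.items() if err <= threshold)
    return chosen, threshold


# Hypothetical data: 20 samples, one OTU everywhere, one OTU in one sample.
samples = {f"S{i:02d}": {"OTU_common": 0.9, "OTU_rare": 0.0} for i in range(20)}
samples["S00"]["OTU_rare"] = 0.1
kept_otus = filter_by_prevalence(samples)

# Hypothetical error rates from 5 repeats of 10-fold cross-validation.
curves = {
    2: [0.40, 0.42, 0.38, 0.41, 0.39],
    5: [0.30, 0.33, 0.27, 0.31, 0.29],
    10: [0.26, 0.30, 0.24, 0.28, 0.25],
    20: [0.25, 0.29, 0.23, 0.27, 0.26],
}
chosen_n, error_threshold = select_otu_number(curves)
print(kept_otus, chosen_n, round(error_threshold, 4))
```

In this made-up example the lowest averaged error occurs with 20 OTUs, but the 10-OTU model already falls under the threshold, so the smaller set is preferred. This trades a negligible loss in accuracy for a more compact marker set.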
1.6 biomarker validation
To verify the biomarkers obtained in this example, they were additionally tested in an independent test population, i.e., the second population. In the second population, there were 11 endometriosis patients and 11 non-endometriosis individuals for each of CL and CU; for CV, there were 12 endometriosis patients and 10 non-endometriosis individuals.
2. Results of the experiment
2.1 similarity of vaginal and uterine microbiota in the same Individual
In order to explore the relationship between the microbiota in different regions of the reproductive tract, the distances between samples of the same individual were calculated. Relative to the sample from the lower third of the vagina (CL), the weighted UniFrac distances to the posterior fornix (CU), cervical canal (CV) mucus, uterus, and peritoneum samples increased in that order. This again indicates that the microbiota of the female reproductive tract forms a continuum.
Samples from the same individual were highly correlated, and the Sørensen index between samples from different sites was consistent with their anatomical connection. Even between cervical mucus and peritoneal fluid the correlation was evident, with an average Sørensen index of 0.255, suggesting that minimally invasive detection of the uterine and peritoneal microbiota in the general population is possible.
In addition, this example also samples the endometrium directly through the external cervical os and samples the cervical mucus from the uterine tip. The distribution of bacteria in samples taken through the external os of the cervix showed a high degree of similarity to that of samples taken from the uterus during surgery, further indicating that uterine microorganisms can be readily obtained and analyzed.
2.2 microorganisms associated with diseases
In order to obtain OTU biomarkers for identifying endometriosis, this example establishes a random forest model through the following steps: (1) a random forest model is built on the first population with the relative abundance of the OTUs as the input features; (2) the random forest model is evaluated by 10-fold cross validation, with the individuals of the first population labeled as endometriosis or non-endometriosis individuals; the ROC curve of each random forest model is obtained, and the area under the ROC curve (AUC) is used as the evaluation index.
In this example, random forest models combined with 10-fold cross validation were used to obtain the optimal biomarkers for identifying endometriosis at each site, as shown in Table 1. Tables 2 to 4 give the enrichment information of the marker sets of the three sites in the samples, and Tables 5 to 7 give the relative abundance information of the marker sets of the three sites in the samples of the first population. The results of identifying endometriosis with the biomarkers of the three sites are shown in FIGS. 1 to 3: FIG. 1 shows identification of endometriosis by the marker set of the lower third of the vagina (CL), FIG. 2 by the marker set of the posterior fornix of the vagina (CU), and FIG. 3 by the marker set of the cervical canal (CV).
TABLE 1 biomarkers and their respective sites
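| Seq ID No. | OTU No. | OTU classification | CL | CU | CV |
|---|---|---|---|---|---|
| 1 | 1 | Lactobacillus sp. | √ | √ | -- |
| 2 | 5 | Leptotrichiaceae | √ | -- | -- |
| 3 | 11 | Vagococcus sp. | -- | -- | √ |
| 4 | 12 | Delftia sp. | -- | -- | √ |
| 5 | 27 | Dysgonomonas sp. | -- | -- | √ |
| 6 | 33 | Aerococcus sp. | √ | -- | -- |
| 7 | 35 | Prevotella sp. | √ | √ | -- |
| 8 | 38 | Lactobacillus sp. | -- | -- | √ |
| 9 | 42 | Tissierellaceae | -- | -- | √ |
| 10 | 48 | Comamonadaceae | -- | -- | √ |
| 11 | 54 | Erysipelotrichaceae | -- | -- | √ |
| 12 | 61 | Lactobacillus sp. | √ | -- | √ |
| 13 | 64 | Dialister sp. | √ | -- | -- |
| 14 | 70 | Erysipelothrix sp. | -- | -- | √ |
| 15 | 86 | Anaerococcus sp. | √ | -- | -- |
| 16 | 108 | Dysgonomonas sp. | -- | -- | √ |
| 17 | 221 | Lactobacillus sp. | √ | -- | -- |
| 18 | 233 | Lactobacillus sp. | √ | -- | -- |
| 19 | 344 | Lactobacillus sp. | √ | -- | -- |
| 20 | 424 | Lactobacillus iners | √ | -- | -- |
| 21 | 464 | Prevotella sp. | √ | -- | -- |
| 22 | 520 | Lactobacillus iners | √ | -- | -- |
| 23 | 628 | Lactobacillus iners | -- | -- | √ |
| 24 | 663 | Lactobacillus sp. | √ | -- | -- |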
In Table 1, the markers of the three sites CL, CU, and CV can each be used independently for the determination, where "√" indicates a biomarker required for the determination at that site and "--" indicates a biomarker that is not required.
When a sample is tested, the relative abundances of the OTUs marked with "√" for the corresponding site are calculated and input into the random forest model, and the result is used to determine whether endometriosis is present.
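The selection of site-specific marker OTUs before model input can be sketched as follows. The marker assignments follow Table 1 and the marker groups recited in claim 2; the names `MARKER_SEQ_IDS`, `SEQ_ID_TO_OTU`, and `feature_vector` are hypothetical, and the trained random forest model itself is not reproduced here.

```python
# Seq ID Nos. of the marker set of each site (per Table 1 / claim 2).
MARKER_SEQ_IDS = {
    "CL": [1, 2, 6, 7, 12, 13, 15, 17, 18, 19, 20, 21, 22, 24],
    "CU": [1, 7],
    "CV": [3, 4, 5, 8, 9, 10, 11, 12, 14, 16, 23],
}

# Seq ID No. -> OTU number, as listed in Table 1.
SEQ_ID_TO_OTU = {
    1: 1, 2: 5, 3: 11, 4: 12, 5: 27, 6: 33, 7: 35, 8: 38,
    9: 42, 10: 48, 11: 54, 12: 61, 13: 64, 14: 70, 15: 86, 16: 108,
    17: 221, 18: 233, 19: 344, 20: 424, 21: 464, 22: 520, 23: 628, 24: 663,
}


def feature_vector(site, rel_abundance):
    """Relative abundances of the site's marker OTUs, in a fixed order.

    rel_abundance maps OTU number -> relative abundance in one sample;
    marker OTUs that were not detected are treated as 0.
    """
    return [rel_abundance.get(SEQ_ID_TO_OTU[s], 0.0) for s in MARKER_SEQ_IDS[site]]


# Hypothetical sample from the posterior fornix (CU).
sample = {1: 0.7, 35: 0.1, 5: 0.2}
print(feature_vector("CU", sample))
```

The resulting vector would then be passed to the classifier trained for that site; OTUs outside the site's marker set are ignored.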
TABLE 2 Abundance information for each OTU of the CL marker set
[Table 2 appears as an image in the original publication.]
TABLE 3 Abundance information for each OTU of the CU marker set
[Table 3 appears as an image in the original publication.]
TABLE 4 Abundance information for each OTU of the CV marker set
[Table 4 appears as an image in the original publication.]
In Tables 2 to 4, the endometriosis group refers to the samples from individuals with endometriosis among the 49 subjects collected in the first population, and the control group refers to the samples from individuals without endometriosis among those 49 subjects.
TABLE 5 Abundance information in the first population for each OTU of the CL marker set
[Table 5 appears as an image in the original publication.]
TABLE 6 Abundance information in the first population for each OTU of the CU marker set
[Table 6 appears as an image in the original publication.]
TABLE 7 Abundance information in the first population for each OTU of the CV marker set
[Table 7 appears as an image in the original publication.]
FIG. 1 shows identification of endometriosis by the marker set of the lower third of the vagina (CL). Panel a shows the distribution of error rates from 5 repetitions of 10-fold cross validation of random forest identification of endometriosis as the number of OTUs increases; the model was trained on the relative abundance of OTUs in the samples, using CL samples from 17 non-endometriosis individuals and 32 endometriosis individuals in total; the black line is the average of the 5 repetitions, the gray lines are the 5 individual repetitions, and the black vertical line marks the number of OTUs in the optimal combination. Panel b shows the receiver operating characteristic (ROC) curve of the cross-validated combination, with an area under the curve (AUC) of 0.8272; the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5.
FIG. 2 shows identification of endometriosis by the marker set of the posterior fornix of the vagina (CU). Panel a shows the distribution of error rates from 5 repetitions of 10-fold cross validation of random forest identification of endometriosis as the number of OTUs increases; the model was trained on the relative abundance of OTUs in the samples, using CU samples from 17 non-endometriosis individuals and 32 endometriosis individuals in total; the black line is the average of the 5 repetitions, the gray lines are the 5 individual repetitions, and the black vertical line marks the number of OTUs in the optimal combination. Panel b shows the receiver operating characteristic (ROC) curve of the cross-validated combination, with an area under the curve (AUC) of 0.5919; the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5.
FIG. 3 shows identification of endometriosis by the marker set of the cervical canal (CV). Panel a shows the distribution of error rates from 5 repetitions of 10-fold cross validation of random forest identification of endometriosis as the number of OTUs increases; the model was trained on the relative abundance of OTUs in the samples, using CV samples from 17 non-endometriosis individuals and 32 endometriosis individuals in total; the black line is the average of the 5 repetitions, the gray lines are the 5 individual repetitions, and the black vertical line marks the number of OTUs in the optimal combination. Panel b shows the receiver operating characteristic (ROC) curve of the cross-validated combination, with an area under the curve (AUC) of 0.8493; the shaded area represents the 95% confidence interval, and the diagonal represents a curve with an AUC of 0.5.
As can be seen from FIGS. 1 to 3, the OTU biomarker sets of the three sites are capable of distinguishing endometriosis individuals from non-endometriosis individuals; the areas under the ROC curves (AUC) are 0.8272 (CL), 0.5919 (CU), and 0.8493 (CV), respectively. The larger the AUC, i.e., the closer it is to 1, the stronger the discriminatory ability and the more accurate the judgment.
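For reference, the AUC used as the evaluation index can be computed from predicted probabilities as the fraction of (endometriosis, non-endometriosis) pairs in which the endometriosis individual receives the higher score. The following Python sketch uses this pairwise formulation; the function name `roc_auc` and the labels and scores are hypothetical and are not data from this example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    labels: 1 for endometriosis, 0 for non-endometriosis.
    scores: predicted probabilities; tied pairs count as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities (illustrative only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 pairs are ordered correctly
print(round(auc, 4))
```

An AUC of 0.5 corresponds to the diagonal in panel b of the figures, i.e., ranking no better than chance, and an AUC of 1 corresponds to perfect separation.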
2.3 biomarker validation
The OTU biomarkers obtained from the random forests were validated in the samples of the second population, and the results are shown in Tables 8, 9, and 10. In Tables 8 to 10, sample numbers such as C003CL, C003CU, and C003CV denote samples collected from the CL, CU, and CV sites, respectively, of the same subject C003. Tables 8 to 10 list the probability of endometriosis predicted for each individual by the three marker sets, and the resulting ROC curves are shown in FIGS. 4 to 6, respectively. In Tables 8 to 10, a probability > 0.5 means that the marker set of that site judges the individual to be at risk of, or suffering from, endometriosis.
TABLE 8 Probability of endometriosis in the second population samples predicted by the CL marker set at the CL site
[Table 8 appears as an image in the original publication.]
TABLE 9 Probability of endometriosis in the second population samples predicted by the CU marker set at the CU site
| Sample No. | Endometriosis (infertility) present (N: no; Y: yes) | Probability |
|---|---|---|
| C001CU | N | 0.285 |
| C003CU | N | 0.106 |
| C004CU | N | 0.369 |
| C005CU | N | 0.443 |
| C007CU | N | 0.298 |
| C008CU | N | 0.588 |
| C009CU | N | 0.675 |
| C012CU | N | 0.988 |
| C014CU | N | 0.121 |
| C019CU | N | 0.188 |
| C021CU | N | 0.584 |
| T001CU | Y | 0.943 |
| T003CU | Y | 0.514 |
| T005CU | Y | 0.969 |
| T006CU | Y | 0.885 |
| T000CU | Y | 0.475 |
| T008CU | Y | 0.135 |
| T015CU | Y | 0.838 |
| T017CU | Y | 0.943 |
| T081CU | Y | 0.722 |
| T082CU | Y | 0.943 |
| T083CU | Y | 0.992 |
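As an illustration of the probability > 0.5 criterion, the following Python sketch applies the cutoff to the CU probabilities listed in Table 9 and tallies the calls against the recorded status. The function name `confusion_counts` is hypothetical; the sketch covers the CU marker set alone and does not reproduce the combined judgment across the three sites.

```python
# (sample number, recorded status, predicted probability) from Table 9.
TABLE_9 = [
    ("C001CU", "N", 0.285), ("C003CU", "N", 0.106), ("C004CU", "N", 0.369),
    ("C005CU", "N", 0.443), ("C007CU", "N", 0.298), ("C008CU", "N", 0.588),
    ("C009CU", "N", 0.675), ("C012CU", "N", 0.988), ("C014CU", "N", 0.121),
    ("C019CU", "N", 0.188), ("C021CU", "N", 0.584), ("T001CU", "Y", 0.943),
    ("T003CU", "Y", 0.514), ("T005CU", "Y", 0.969), ("T006CU", "Y", 0.885),
    ("T000CU", "Y", 0.475), ("T008CU", "Y", 0.135), ("T015CU", "Y", 0.838),
    ("T017CU", "Y", 0.943), ("T081CU", "Y", 0.722), ("T082CU", "Y", 0.943),
    ("T083CU", "Y", 0.992),
]


def confusion_counts(rows, cutoff=0.5):
    """Tally calls made with 'probability > cutoff' against recorded status."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for _sample, status, prob in rows:
        predicted = prob > cutoff
        actual = status == "Y"
        if predicted and actual:
            counts["TP"] += 1
        elif predicted:
            counts["FP"] += 1
        elif actual:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


counts = confusion_counts(TABLE_9)
print(counts)  # TP=9, FP=4, TN=7, FN=2 for the values listed above
```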
TABLE 10 Probability of endometriosis in the second population samples predicted by the CV marker set at the CV site
[Table 10 appears as an image in the original publication.]
FIG. 4 shows that, when the probability of endometriosis is judged at the CL site with the CL marker set, the AUC is 0.8750; FIG. 5 shows that, when it is judged at the CU site with the CU marker set, the AUC is 0.840; FIG. 6 shows that, when it is judged at the CV site with the CV marker set, the AUC is 0.9189. These three marker sets therefore have high discriminatory power and can be used for the detection of endometriosis, and the results are consistent with those in Tables 8 to 10. For the results in Tables 8 to 10, an individual is judged to have endometriosis or to be at risk of developing it when at least one of the probabilities predicted by the three marker sets is greater than 0.5, and the judgments thus obtained agree with the actual situation.
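The combined judgment described above, in which an individual is flagged when at least one of the three site-specific probabilities exceeds 0.5, can be sketched as follows. The function name `at_risk` and the probabilities are hypothetical and are not taken from Tables 8 to 10.

```python
def at_risk(site_probabilities, cutoff=0.5):
    """True when any site's marker set predicts a probability above cutoff."""
    return any(p > cutoff for p in site_probabilities.values())


# Hypothetical predictions for two individuals (illustrative only).
subject_a = {"CL": 0.20, "CU": 0.60, "CV": 0.10}
subject_b = {"CL": 0.20, "CU": 0.50, "CV": 0.10}
print(at_risk(subject_a), at_risk(subject_b))
```

Because the comparison is strict, a probability of exactly 0.5 does not by itself flag an individual in this sketch.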
The foregoing is a more detailed description of the present application in connection with specific embodiments thereof, and it is not intended that the present application be limited to the specific embodiments thereof. For those skilled in the art to which the present application pertains, several simple deductions or substitutions may be made without departing from the concept of the present application, and all should be considered as belonging to the protection scope of the present application.
SEQUENCE LISTING
<110> BGI Shenzhen
<120> biomarker combination for endometriosis detection and application thereof
<130> 16I23216
<160> 26
<170> PatentIn version 3.3
<210> 1
<211> 214
<212> DNA
<213> Lactobacillus crispatus
<400> 1
atgcgtagat atatggaaga acaccagtgg cgaaggcggc tctctggtct gcaactgacg 60
ctgaggctcg aaagcatggg tagcgaacag gattagatac cctggtagtc catgccgtaa 120
acgatgagtg ctaagtgttg ggaggtttcc gcctctcagt gctgcagcta acgcattaag 180
cactccgcct ggggagtacg accgcaaggt tgaa 214
<210> 2
<211> 211
<212> DNA
<213> Leptotrichiaceae
<400> 2
attcgtagat atgtgtagga atgccgatga tgaagataac tcactggaca gaaactgacg 60
ctgaagtgcg aaagctaggg gagcaaacag gattagatac cctggtagtc ctagctgtaa 120
acgatgatca ctgggtgtgg ggatgcgaag tctctgtgcc gaagcaaaag cgataagtga 180
tccgcctggg gagtacgttc gcaagaatga a 211
<210> 3
<211> 214
<212> DNA
<213> Vagococcus sp.
<400> 3
atgcgtagat atatggagga acaccagtgg cgaaggcgac tctctggtct gtaactgaca 60
ctgaggctcg aaagcgtggg gagcaaacag gattagatac cctggtagtc cacgccgtaa 120
acgatgagtg ctaagtgttg gagggtttcc gcccttcagt gctgcagtta acgcattaag 180
cactccgcct ggggagtacg gtcgcaagac tgaa 214
<210> 4
<211> 212
<212> DNA
<213> Delftia sp.
<400> 4
atgcgtagat atgcggagga acaccgatgg cgaaggcaat cccctggacc tgtactgacg 60
ctcatgcacg aaagcgtggg gagcaaacag gattagatac cctggtagtc cacgccctaa 120
acgatgtcaa ctggttgttg ggaattagtt ttctcagtaa cgaagctaac gcgtgaagtt 180
gaccgcctgg ggagtacggc cgcaaggttg aa 212
<210> 5
<211> 209
<212> DNA
<213> Dysgonomonas sp.
<400> 5
atgcatagat ataacgagga actccaattg cgtaggcagc ttactaaact ataactgacg 60
ctcatgcacg aaagtgtggg tatcaaacag gattagatac ctggtagtcc acacagtaaa 120
cgatgattac tagttttttg cgatataccg taagagacta agcgaaagcg ataagtaatc 180
cacctgggga gtacgttggc aacaatgaa 209
<210> 6
<211> 214
<212> DNA
<213> Aerococcus sp.
<400> 6
atgcgtagat atatggaaga acaccagtgg cgaaagcgac tttctggtct gtcactgacg 60
ctgaggcccg aaagcgtggg tagcaaacag gattagatac cctggtagtc cacgccgtaa 120
acgatgagcg ctaggtgttg gagggtttcc acccttcagt gccgcagcta acgcattaag 180
cgctccgcct ggggagtacg accgcaaggt tgaa 214
<210> 7
<211> 211
<212> DNA
<213> Prevotella sp.
<400> 7
atgcttagat atcatgagga actccgattg cgaaggcagc ttgctgcagt gcgactgacg 60
cttaggctcg aaggtgcggg tatcaaacag gattagatac cctggtagtc cgcacggtaa 120
acgatggatg cccgctgtcc gcccattcgt ggcgggcggc caagcgaaag cgttaagcat 180
cccacctggg gagtacgccg gcaacggtga a 211
<210> 8
<211> 215
<212> DNA
<213> Lactobacillus sp.
<400> 8
atgcgtagat atatggaaga acaccagtgg cgaaggcggc tctctggtct gtaactgacg 60
ctgaggctcg aaagcatggg gtagcgaaca ggattagata ccctggtagt ccatgccgta 120
aacgatgagt gctaagtgtt gggaggtttc cgcctctcag tgctgcagct aacgcattaa 180
gcactccgcc tggggagtac gaccgcaagg ttgaa 215
<210> 9
<211> 210
<212> DNA
<213> Tissierellaceae
<400> 9
atgcgtagat attaggagga ataccagtgg cgaaggcgac tttctggact tatactgaca 60
ctgaggaacg aaagcgtggg gagcaaacag gattagatac cctggtagtc cacgccgtaa 120
acgatgagtg ctaggtgttg ggggtcaaac ctcggtgccg cagctaacgc attaagcact 180
ccgcctgggg agtacgtacg caagtatgaa 210
<210> 10
<211> 212
<212> DNA
<213> Comamonadaceae
<400> 10
atgcgtagat atgcggagga acaccgatgg cgaaggcaac cccctgggcc tgtactgacg 60
ctcatgcacg aaagcgtggg gagcaaacag gattagatac cctggtagtc cacgccctaa 120
acgatgtcaa ctggttgttg ggtcttaact gactcagtaa cgaagctaac gcgtgaagtt 180
gaccgcctgg ggagtacggc cgcaaggttg aa 212
<210> 11
<211> 206
<212> DNA
<213> Erysipelotrichaceae
<400> 11
atgcgtagat atatggagga acaccagtgg cgaaggcgac tttctggtct gtaactgacg 60
ctgaggcacg aaagcgtggg gagcaaatag gattagatac cctagtagtc cacgccgtaa 120
acgatgagaa ctaagtgttg gggcaactca gtgctgaagc aaacgcatta agttctccgc 180
ctggggagta tgctcgcaag agtgaa 206
<210> 12
<211> 211
<212> DNA
<213> Lactobacillus sp.
<400> 12
atgcgtagat atatggagaa caccagtggc gaggcggctc tctggtctgc aactgacgct 60
gaggctcgaa gcatgggtag cgaacaggat tagataccct ggtagtccat gccgtaaacg 120
atgagtgcta agtgttggga ggtttccgcc tctcagtgct gcagctaacg cattaagcac 180
tccgcctggg gagtacgacc gcaaggttga a 211
<210> 13
<211> 214
<212> DNA
<213> Dialister sp.
<400> 13
atgcgtagat attaggaaga acaccggtgg cgaaggcgac tttctggacg aaaactgacg 60
ctgaggcgcg aaagcgtggg gagcaaacag gattagatac cctggtagtc cacgccgtaa 120
acgatggata ctaggtgtag gaggtatcga ccccttctgt gccggagtta acgcaataag 180
tatcccgcct gggaagtacg atcgcaagat taaa 214
<210> 14
<211> 207
<212> DNA
<213> Erysipelothrix sp.
<400> 14
atgcgtagat atatggagga acaccagtgg cgaaggcggc tcactggcct gttactgacg 60
ctgaggctcg aaagcgtggg gagcaaatag gattagatac cctagtagtc cacgccgtaa 120
acgatggata ctaagtgttg gtgaaaaatc agtgctgtag ttaacgcaat aagtatcccg 180
cctggggagt atgcgcgcaa gcgtcaa 207
<210> 15
<211> 208
<212> DNA
<213> Anaerococcus sp.
<400> 15
atgcgcagat attaggagga ataccggtgg cgaaggcgac tttctggcca taaactgacg 60
ctgaggtacg aaagcgtggg tagcaaacag gattagatac cctggtagtc cacgccgtaa 120
acgatgagtg ttaggtgtct ggaataatct gggtgccgca gctaacgcaa taaacactcc 180
gcctggggag tacgcacgca agtgtgaa 208
<210> 16
<211> 210
<212> DNA
<213> Dysgonomonas sp.
<400> 16
atgcatagat ataacgagga actccaattg cgtaggcagc ttactaaact ataactgacg 60
ctcatgcacg aaagtgtggg tatcaaacag gattagatac cctggtgtcc acacagtaaa 120
cgatgattac tagttttttt gcgatatacc gtaagagact aagcgaaagc gataagtaat 180
ccacctgggg agtacgttgg caacaatgaa 210
<210> 17
<211> 214
<212> DNA
<213> Lactobacillus sp.
<400> 17
atgcgtagat atatggaaga acaccagtgg cgaaggcggc tctctggtct gtaactgacg 60
ctgaggctcg aagacatggg tagcgaacag gattagatac cctggtagtc catgccgtaa 120
acgatgagtg ctaagtgttg ggaggtttcc gcctctcagt gctgcagcta acgcattaag 180
cactccgcct ggggagtacg accgcaaggt tgaa 214
<210> 18
<211> 214
<212> DNA
<213> Lactobacillus sp.
<400> 18
atgcgtagat atatggaaga acaccagtgg cgaaggcggc tctctggtct gcaactgacg 60
ctgaggctcg aaagcatggg tagcgaacag gattagatac cctggtagtc catgccgtaa 120
acgatgagtg ctaagtgttg ggaggtttcc gcctctcagt gctgcagcta acgcattaag 180
cactccgcct ggggagtacg accgcaaggg ttga 214
<210> 19
<211> 215
<212> DNA
<213> Lactobacillus sp.
<400> 19
atgcgtagat atatggaaga acaccagtgg cgaaggcggc tctctggtct gcaactgacg 60
ctgaggctcg aaagcatggg tagtgaacag gattagatac cctggtagtc catgccgtaa 120
acgatgagtg ctaagtgttg ggaggtttcc gcctctcagt gctgcagcta acgcattaag 180
cactccgcct ggggaggtac gaccgcaagg ttgaa 215
<210> 20
<211> 213
<212> DNA
<213> Lactobacillus iners
<400> 20
atgcgtagat atatggaaga acaccggtgg cgaaggcggc tctctggtct gttactgacg 60
ctgaggctac gaacatgggt agcgaacagg attagatacc ctggtagtcc atgccgtaaa 120
cgatgagtgc taagtgttgg gaggtttccg cctctcagtg ctgcagctaa cgcattaagc 180
actccgcctg gggagtacga ccgcaaggtt gaa 213
<210> 21
<211> 209
<212> DNA
<213> Prevotella sp.
<400> 21
atgcttagat atcatgacga actccgattg cgcaggcagc ttactgtagc ataactgacg 60
ctgatgctcg aagtgcgggt atcaaacagg attagatacc tggtagtccg cacggtaaac 120
gatggatgct cgctattcgt cctatttgga tgagtggcca agtgaaaaca ttaagcatcc 180
cacctgggga gtacgccggc aacggtgag 209
<210> 22
<211> 213
<212> DNA
<213> Lactobacillus iners
<400> 22
atgcgtagat atatggaaga caccgggtgg cgaggcggct ctctggtctg ttactgacgc 60
tgaggctcga aagcatgggt agcgaacagg attagatacc ctggtagtcc atgccgtaaa 120
cgatgagtgc taagtgttgg gaggtttccg cctctcagtg ctgcagctaa cgcattaagc 180
actccgcctg gggagtacga ccgcaaggtt gaa 213
<210> 23
<211> 213
<212> DNA
<213> Lactobacillus iners
<400> 23
atgcgtagat atatggaaga acaccggtgg cgaaggcggc tctctggtct gttactgacg 60
ctgaggctcg aaagcatggg tagcgaacag gattagatac cctggtagtc catgccgtaa 120
tgatgagtgc taagtgttgg gaggtttccg cctctcagtg ctgcagctaa cgcattaagc 180
actccgtctg gggagtacga ccgcaaggtt gaa 213
<210> 24
<211> 213
<212> DNA
<213> Lactobacillus sp.
<400> 24
atgcgtagat atatggaaga acaccagtgg cgaaggcggc tctctggtct gcaactgacg 60
ctgaggctcg aaagcatggg tagcgaacag gattagatac cctggtagtc catgccgtaa 120
acgatgagtg ctaagtgttg ggaggtttcc gcctctcagt gctgcagcat acgcataagc 180
actccgcctg gggagtacga ccgcaaggtt gaa 213
<210> 25
<211> 19
<212> DNA
<213> Artificial sequence
<400> 25
gtgccagcmg ccgcggtaa 19
<210> 26
<211> 18
<212> DNA
<213> Artificial sequence
<400> 26
ccgtcaattc mtttragt 18

Claims (7)

1. A biomarker combination for use in the detection of endometriosis or the assessment of risk of developing endometriosis, characterized in that: the biomarker combination comprises twenty-four nucleic acids which are shown as Seq ID No.1 to Seq ID No.24 respectively.
2. A biomarker combination for use in the detection of endometriosis or the assessment of risk of developing endometriosis, characterized in that: the biomarker panel comprises a first marker panel, a second marker panel, and a third marker panel;
the first marker group consists of fourteen nucleic acids, and the fourteen nucleic acids are respectively shown as Seq ID No.1, Seq ID No.2, Seq ID No.6, Seq ID No.7, Seq ID No.12, Seq ID No.13, Seq ID No.15, Seq ID No. 17-22 and Seq ID No. 24;
the second marker group consists of two nucleic acids, wherein the two nucleic acids are sequences shown as Seq ID No.1 and Seq ID No.7 respectively;
the third marker group consists of eleven nucleic acids, and the eleven nucleic acids are respectively shown as Seq ID No.3 to Seq ID No.5, Seq ID No.8 to Seq ID No.12, Seq ID No.14, Seq ID No.16 and Seq ID No. 23.
3. The biomarker combination according to claim 2, characterized in that: the first marker set is a CL marker set used for endometriosis detection or risk assessment of endometriosis for a sample from the lower third of the vagina.
4. The biomarker combination according to claim 2, characterized in that: the second marker set is a CU marker set used for endometriosis detection or risk assessment of a sample from the posterior fornix of the vagina.
5. The biomarker combination according to claim 2, characterized in that: the third marker set is a CV marker set used for detecting endometriosis or evaluating the disease risk of a sample from a cervical canal.
6. A kit for detecting endometriosis or assessing risk of endometriosis, comprising: the kit comprises a primer pair for detecting the biomarker combination as defined in any one of claims 1 to 5, wherein a forward primer of the primer pair is represented by SEQ ID No.25, and a reverse primer of the primer pair is represented by SEQ ID No. 26.
7. Use of a biomarker combination according to any of claims 1 to 5 in the manufacture of a kit or a detection tool for the detection of endometriosis or for the assessment of risk of disease.
CN201610831390.6A 2016-09-19 2016-09-19 Biomarker combination for detecting endometriosis and application thereof Active CN107858416B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201610831390.6A CN107858416B (en) 2016-09-19 2016-09-19 Biomarker combination for detecting endometriosis and application thereof
CN201780047955.4A CN109715828B (en) 2016-09-19 2017-08-07 Biomarker combination for detecting endometriosis and application thereof
PCT/CN2017/096249 WO2018049947A1 (en) 2016-09-19 2017-08-07 Biomarker composition for detection of endometriosis and application thereof

Publications (2)

Publication Number Publication Date
CN107858416A CN107858416A (en) 2018-03-30
CN107858416B true CN107858416B (en) 2021-05-28

Family

ID=61618620

Country Status (2)

Country Link
CN (2) CN107858416B (en)
WO (1) WO2018049947A1 (en)

Also Published As

Publication number Publication date
CN107858416A (en) 2018-03-30
WO2018049947A1 (en) 2018-03-22
CN109715828B (en) 2022-07-19
CN109715828A (en) 2019-05-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Beishan Industrial Zone Building in Yantian District of Shenzhen city of Guangdong Province in 518083

Applicant after: BGI SHENZHEN

Address before: Beishan Industrial Zone Building in Yantian District of Shenzhen city of Guangdong Province in 518083

Applicant before: BGI SHENZHEN

GR01 Patent grant