CN116631510B - Device for differential diagnosis of Crohn's disease and ulcerative colitis - Google Patents

Publication number
CN116631510B
Authority
CN
China
Prior art keywords
sample
ulcerative colitis
gene
mmps
expression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310559017.XA
Other languages
Chinese (zh)
Other versions
CN116631510A (en)
Inventor
邓江
张艳宇
赵宁
吕丽萍
马平
张阳阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Academy of Military Medical Sciences AMMS of PLA
Original Assignee
Academy of Military Medical Sciences AMMS of PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Academy of Military Medical Sciences AMMS of PLA
Publication of CN116631510A
Application granted
Publication of CN116631510B

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00: ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10: Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00: ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/158: Expression markers
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Chemical & Material Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Organic Chemistry (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Evolutionary Biology (AREA)
  • Analytical Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Immunology (AREA)
  • Bioethics (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Physiology (AREA)
  • Artificial Intelligence (AREA)
  • Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a device for assisting in distinguishing Crohn's disease from ulcerative colitis, comprising parameter acquisition equipment and a readable carrier. The parameter acquisition equipment comprises a device for acquiring the parameters involved in the readable carrier. The readable carrier records formula (1): P_UC = exp(MMPs Scores) / (1 + exp(MMPs Scores)), wherein P_UC is the probability that the sample to be tested is predicted as ulcerative colitis; when P_UC is less than 0.5, the sample to be tested is Crohn's disease. The model built into the device does not use the specific expression values of the MMPs-related gene set; it is instead based on binary variables converted from those expression values, which better overcomes batch differences between chip detection platforms and gives the model higher clinical value.

Description

Device for differential diagnosis of Crohn's disease and ulcerative colitis
Technical Field
The invention relates to a device for the differential diagnosis of Crohn's disease and ulcerative colitis based on a model constructed from binary variables of gene expression in the patient's intestinal mucosa, and belongs to the field of biomedicine.
Background
Inflammatory bowel disease (IBD) causes chronic intestinal inflammation, arises from the interaction of genetic and environmental factors that affect immune responses, and is associated with significant morbidity. Crohn's disease (CD) and ulcerative colitis (UC) are the two major inflammatory bowel diseases. Although CD and UC share some pathological and clinical characteristics, they also differ, indicating that they are two distinct disease types. CD is characterized by fissuring ulcers, granulomatous inflammation and submucosal fibrosis, whereas the characteristic histological findings of UC are rectal crypt distortion, lymphocyte infiltration and chronic inflammation, often limited to the lamina propria. Clinically, the differential diagnosis of IBD is usually made by comprehensive assessment of clinical manifestations together with endoscopic, histopathological, radiological and laboratory findings.
Currently, differential diagnosis between CD and UC in patients with IBD colitis is critical for tailoring a treatment plan, since the two diseases call for different treatments and respond through different mechanisms after diagnosis. However, distinguishing these subtypes remains a significant clinical challenge, as there is currently no single diagnostic gold standard for UC and CD. According to published reports, about 5% to 15% of patients do not meet the strict criteria for either UC or CD, and up to 14% of patients experience at least one change of diagnosis between UC and CD. Thus, diagnosis of IBD remains difficult with current methods, particularly in patients whose inflammatory lesions are limited to the colon.
Disclosure of Invention
The invention aims to provide a device and a method for assisting in judging Crohn disease and/or ulcerative colitis.
The invention provides a kit for assisting in judging Crohn's disease and/or ulcerative colitis, which comprises parameter acquisition equipment and a readable carrier;
the parameter acquisition device comprises a device for acquiring various parameters involved in the readable carrier;
the readable carrier has recorded thereon the following formulas (1) - (3),
P_UC = exp(MMPs Scores) / (1 + exp(MMPs Scores))  (1)
MMPs Scores = -1.3813 + [ANXA1 × (0.6358)] + [CXCL13 × (0.1000)] + [MMP1 × (0.2507)] + [CXCL1 × (0.4478)]  (2)
P_UC + P_CD = 1  (3);
wherein P_UC is the probability that the sample to be tested is predicted as ulcerative colitis; P_CD is the probability that the case to be tested is predicted as Crohn's disease; ANXA1, CXCL13, MMP1 and CXCL1 are the binary variables of the ANXA1, CXCL13, MMP1 and CXCL1 genes, respectively; if the expression value of a gene in the sample to be tested is greater than the median expression value of that gene in ulcerative colitis samples, the binary variable of the gene is assigned 1; otherwise, the binary variable of the gene is assigned 0;
when P_UC is greater than 0.5, the sample to be tested is ulcerative colitis; when P_UC is less than 0.5, the sample to be tested is Crohn's disease.
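Because the logistic function in formula (1) is monotonic, the 0.5 cutoff on the UC probability is equivalent to a sign test on the MMPs Scores. The short derivation below uses only formulas (1) to (3); the symbol S is shorthand introduced here for "MMPs Scores" and does not appear in the original text.

```latex
\begin{aligned}
P_{UC} &= \frac{\exp(S)}{1+\exp(S)}, \qquad S = \text{MMPs Scores},\\
P_{CD} &= 1 - P_{UC} = \frac{1}{1+\exp(S)},\\
P_{UC} > 0.5 &\iff 2\exp(S) > 1+\exp(S) \iff \exp(S) > 1 \iff S > 0 .
\end{aligned}
```

With the coefficients of formula (2), S ranges from -1.3813 (all four binary variables equal to 0) to -1.3813 + 0.6358 + 0.1000 + 0.2507 + 0.4478 = 0.0530 (all four equal to 1).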
The parameter acquisition equipment is a device for detecting the expression quantity of ANXA1, CXCL13, MMP1 and CXCL1 genes in a sample to be detected.
Wherein the kit further comprises recording means and/or calculating means; the recording means comprises a pen and/or a computer; the computing means comprises a calculator and/or the computer.
Wherein the readable carrier is a kit instruction; the content of formula I is printed on a card.
Wherein the readable carrier is a computer readable carrier.
The median value of the expression values of the genes in the ulcerative colitis samples is obtained by detecting the expression amounts of the genes by using the same detection device for at least 10 ulcerative colitis samples, and the average value of the expression amounts of the ulcerative colitis samples is the median value of the expression values in the ulcerative colitis samples.
The invention also provides a kit for assisting in judging the Crohn's disease and/or ulcerative colitis, which comprises a device for detecting the expression level of ANXA1, a device for detecting the expression level of CXCL13, a device for detecting the expression level of MMP1, a device for detecting the expression level of CXCL1 and a computing device provided with a parameter operation module; the parameter operation module can perform operations of the following formulas (1) - (3):
P_UC = exp(MMPs Scores) / (1 + exp(MMPs Scores))  (1);
MMPs Scores = -1.3813 + [ANXA1 × (0.6358)] + [CXCL13 × (0.1000)] + [MMP1 × (0.2507)] + [CXCL1 × (0.4478)]  (2);
P_UC + P_CD = 1  (3);
wherein P_UC is the probability that the sample to be tested is predicted as ulcerative colitis; P_CD is the probability that the case to be tested is predicted as Crohn's disease; ANXA1, CXCL13, MMP1 and CXCL1 are the binary variables of the ANXA1, CXCL13, MMP1 and CXCL1 genes, respectively; if the expression value of a gene in the sample to be tested is greater than the median expression value of that gene in the samples, the binary variable of the gene is assigned 1; otherwise, the binary variable of the gene is assigned 0;
when P_UC is greater than 0.5, the sample to be tested is ulcerative colitis; when P_UC is less than 0.5, the sample to be tested is Crohn's disease.
The use of a system for detecting the expression levels of the ANXA1, CXCL13, MMP1 and CXCL1 genes in the preparation of products for distinguishing Crohn's disease from ulcerative colitis also falls within the scope of the present invention.
Wherein the system for detecting the expression levels of the ANXA1, CXCL13, MMP1 and CXCL1 genes is the Affymetrix Human Gene 1.0 ST Array, the Affymetrix Human Genome U133 Plus 2.0 Array, or the Agilent-014850 Whole Human Genome Microarray 4x44K G4112F.
The ANXA1 gene is annexin A1 (NM_000700.3); the CXCL13 gene is C-X-C motif chemokine ligand 13 (NM_001371558.1); the MMP1 gene is matrix metallopeptidase 1 (NM_002421); the CXCL1 gene is C-X-C motif chemokine ligand 1 (NM_001511).
The invention provides a method for establishing an IBD differential diagnosis model using matrix metalloproteinase family related genes (MMPs-associated genes), together with its validation results in multi-center cohorts. Matrix metalloproteinases (MMPs) are a group of zinc-dependent neutral peptidases that degrade all components of the extracellular matrix (ECM). They are associated with extensive mucosal degradation and tissue remodeling and ultimately contribute to the development of ulcers, fistulae and stenosis, so MMPs are an important gene family that participates in and regulates the course of inflammatory bowel disease. To date, there is sufficient evidence that IBD-associated mucosal inflammation is associated with enhanced induction of various MMPs, and at least 3 clinical trials of matrix metalloproteinase inhibitors in IBD treatment have been publicly reported. Our study showed that the MMPs-related gene set is also the main differential gene set between CD and UC. To overcome differences between the detection platforms of cohorts from different sources, the expression levels of the MMPs-related gene set were converted into binary variables, and on that basis a differential diagnosis model was built by least absolute shrinkage and selection operator (LASSO) logistic regression to distinguish CD from UC. Finally, the model was validated in the currently published IBD cohorts that meet the requirements, with good results. Our diagnostic model thus provides a promising diagnostic tool with the potential to improve clinical practice.
Advantages of this method include: 1) the method was established and validated on most of the CD and UC chip data reported to date, and pooling multi-center studies into a large sample is critical for a highly heterogeneous disease such as IBD; moreover, no gene expression model for the differential diagnosis of UC and CD has previously been reported; 2) different technical routes were used to integrate the multi-center IBD cohorts, which effectively reduces the bias introduced by relying on a single data-integration method; 3) model evaluation strictly follows the current clinical prediction model reporting guideline TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis): discrimination, calibration and clinical applicability were assessed separately in different centers and cohorts, which corresponds to the highest level of evidence in the guideline's quality assessment; 4) the constructed model does not use the specific expression values of the MMPs-related gene set but is based on binary variables converted from them, which better resolves batch differences between chip detection platforms and gives the model higher clinical value.
Drawings
FIG. 1 shows the protein interaction network constructed from the differentially expressed genes (DEGs) screened by the RRA method, and the important gene module identified by MCODE.
FIG. 2 shows the protein interaction network constructed from the DEGs found by data integration, and the important gene module identified by MCODE.
FIG. 3 is a schematic diagram of how the genes finally included in the model were determined by LASSO regression and cross-validation. The left dashed line is the penalty coefficient log(λ) corresponding to the optimal AUC determined by cross-validation; the right dashed line is the penalty coefficient log(λ) corresponding to the optimal AUC plus 1 standard error.
FIG. 4 is a nomogram drawn from the constructed model.
FIG. 5 shows the diagnostic performance of the constructed model in the training cohort, including the ROC curve, calibration curve and decision curve analysis (DCA).
FIG. 6 shows the diagnostic performance of the constructed model in the validation cohort GSE75214, including the ROC curve, calibration curve and decision curve.
FIG. 7 shows the diagnostic performance of the constructed model in the validation cohort GSE179285, including the ROC curve, calibration curve and decision curve.
Detailed Description
The following detailed description of the invention is provided in connection with the accompanying drawings that are presented to illustrate the invention and not to limit the scope thereof. The examples provided below are intended as guidelines for further modifications by one of ordinary skill in the art and are not to be construed as limiting the invention in any way.
The experimental methods in the following examples, unless otherwise specified, are conventional methods, and are carried out according to techniques or conditions described in the literature in the field or according to the product specifications. Materials, reagents and the like used in the examples described below are commercially available unless otherwise specified.
Example 1
1. Determining and incorporating data sets to be analyzed
The Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) was searched with the following keywords: ("Inflammatory Bowel Diseases"[MeSH Terms] OR Inflammatory Bowel Diseases[All Fields]) AND "Homo sapiens"[gargn] AND ("Expression profiling by array"[Filter] AND ("2008/01/01"[PDAT]). A total of 139 data sets were retrieved and manually screened according to the following inclusion criteria: (1) a sample size greater than 15; (2) both CD and UC samples present in the data set; (3) samples taken from the intestinal mucosa of the ileum or colon, excluding blood and other sources; (4) gene annotation information available. Finally, 5 data sets from different centers were included, including GSE75214 (n=59/74, given as CD/UC, likewise below), GSE10616 (n=32/10), GSE36807 (n=13/15) and GSE9686 (n=11/5).
TABLE 1
2. Integrated analysis of different data sets based on Robust Rank Aggregation (RRA) analysis method
Based on the RRA method, we integrated 4 data sets from different sources (GSE75214, GSE10616, GSE36807 and GSE9686) and identified differentially expressed genes (DEGs) using logFC > 0.7 and adjP < 0.05 as the criteria; 141 DEGs were identified in total. Details are shown in Table 2. A protein interaction network was then constructed using the STRING website (https://cn.string-db.org/) and Cytoscape software (v3.7.2), and important functional modules were identified with the MCODE (Molecular Complex Detection) plug-in. The major members of the most important module belong to the MMP family; the module includes MMP1, MMP12, PLAU, MMP9, CXCL1, MMP10, PTGS2, TIMP1 and MMP7, with MMP3 as the seed gene. See FIG. 1: in FIG. 1A, genes up-regulated in UC are shown in orange, genes up-regulated in CD in blue, and the most important gene module identified by the software in yellow; the module is shown further in FIG. 1B, where yellow indicates the seed gene.
TABLE 2
3. Method for carrying out integrated analysis on different data sets based on batch correction and merging
To reduce the bias of the RRA method, a second approach was used to integrate the data sets. First, because the GSE10616, GSE36807 and GSE9686 data sets come from the same chip platform (GPL570), the 3 cohorts were batch-corrected and merged with the sva package in R, and the resulting data set was named the merged data set (Combined Datasets). Differential analysis was then performed separately on Combined Datasets and on GSE75214, with DEGs identified using logFC > 0.6 and adjP < 0.1 as the criteria. Finally, the intersection of the DEGs identified in the 2 data sets was taken, giving 65 DEGs in total; see Table 3. The PPI network was constructed again as described above and the most important gene module was identified by MCODE. The module still consisted mainly of MMPs family genes; it includes MMP12, MMP10, MMP3, MMP9, TIMP1, CXCL1, PLAU, S100A9, CXCL13, S100A8, ANXA1 and S100A12, with MMP7 as the seed gene. See FIG. 2: in FIG. 2A, genes up-regulated in UC are shown in orange, genes up-regulated in CD in blue, and the most important gene module identified by the software in yellow; the module is shown further in FIG. 2B, where yellow indicates the seed gene.
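The threshold-then-intersect step described above can be sketched in a few lines of Python. This is only an illustration: the gene names and numbers are hypothetical placeholders rather than values from Table 3, `select_degs` is a helper name introduced here, and applying the logFC cutoff to the absolute value is an assumption, since the text states only "logFC > 0.6".

```python
from typing import Dict, Set, Tuple

# gene -> (logFC, adjusted P value); hypothetical numbers for illustration only
DiffResult = Dict[str, Tuple[float, float]]


def select_degs(results: DiffResult,
                min_abs_logfc: float = 0.6,
                max_adj_p: float = 0.1) -> Set[str]:
    """Keep genes passing both cutoffs (absolute logFC is an assumption)."""
    return {
        gene
        for gene, (logfc, adj_p) in results.items()
        if abs(logfc) > min_abs_logfc and adj_p < max_adj_p
    }


# Placeholder results for the merged data set and for GSE75214
combined_results: DiffResult = {
    "GENE_A": (1.2, 0.01),
    "GENE_B": (0.9, 0.04),
    "GENE_C": (0.3, 0.02),   # fails the logFC cutoff
    "GENE_D": (0.8, 0.20),   # fails the adjusted P cutoff
}
gse75214_results: DiffResult = {
    "GENE_A": (1.0, 0.03),
    "GENE_B": (0.5, 0.01),   # fails the logFC cutoff in this data set
    "GENE_D": (1.1, 0.05),
}

# Only genes identified as DEGs in both data sets are retained
shared_degs = select_degs(combined_results) & select_degs(gse75214_results)
print(sorted(shared_degs))
```

Taking the intersection rather than the union keeps only genes that are differential in both analyses, which matches the conservative intent of the second integration route.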
TABLE 3
4. Construction of Lasso logistic regression model
Based on the two technical routes above, the MMPs-related genes were considered the most important differential gene set between UC and CD. The gene sets identified by the 2 methods were combined, and 15 genes remained after duplicates were removed: MMP3, MMP1, MMP12, PLAU, MMP9, CXCL1, MMP10, PTGS2, TIMP1, MMP7, CXCL13, S100A12, S100A8, S100A9, and ANXA1.
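The merge of the two modules can be checked with a simple set union. The sketch below uses the gene lists given in the text for the RRA-based module and the merged-data module; the variable names are illustrative.

```python
# Module members as listed in the text (seed genes included)
rra_module = ["MMP1", "MMP12", "PLAU", "MMP9", "CXCL1",
              "MMP10", "PTGS2", "TIMP1", "MMP7", "MMP3"]
merged_module = ["MMP12", "MMP10", "MMP3", "MMP9", "TIMP1", "CXCL1", "PLAU",
                 "S100A9", "CXCL13", "S100A8", "ANXA1", "S100A12", "MMP7"]

# Union of both modules with duplicates removed
candidate_genes = sorted(set(rra_module) | set(merged_module))
print(len(candidate_genes))
print(candidate_genes)
```

The union contains 15 genes, because 8 of the 10 RRA-module genes also appear among the 13 genes of the merged-data module.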
To overcome the difficulty of applying the model across chip platforms with batch differences, we converted the 15 candidate genes into binary variables. For a gene whose expression is increased in UC, if the expression value of the gene is greater than the median expression value of that gene in all samples, the binary variable of the MMP-related gene is assigned 1; otherwise, it is assigned 0. For a gene whose expression is increased in CD, if the expression value of the gene is less than the median expression value of that gene in all samples, the binary variable of the MMP-related gene is assigned 1; otherwise, it is assigned 0. In this way the expression values of the 15 genes were converted from continuous variables to binary variables. For example, for one patient in Combined Datasets, ANXA1, MMP10, CXCL13, TIMP1, MMP3, MMP7, MMP9, S100A12, PLAU, MMP12, S100A9, PTGS2, CXCL1 and S100A8 are all genes whose expression is up-regulated in UC. The expression values are 1.9734573, 1.9701188, 1.1136878, 2.8159726, 2.7689527, 4.7186331, 2.0414428, 2.1097156, 1.7163029, 2.1842115, 2.4673306, 2.9328217, 1.6551834, 5.2526517 and 2.4706825, respectively, and the corresponding medians are 3.4117391, 3.2046994, 3.44135835, 5.10064625, 4.923122, 5.00327205, 3.33740685, 4.17297635, 2.2498484, 3.638494, 5.400392, 3.835166, 2.6820964, 5.1378286 and 4.3677868, respectively; after conversion the binary variables of the 15 genes are 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0.
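The conversion rule can be reproduced on the worked example above. In this minimal sketch the 15 expression values and 15 medians are copied from the text in the order given; the function name `binarize` and the `up_in_uc` flag are names introduced here for illustration.

```python
from typing import List


def binarize(values: List[float], medians: List[float],
             up_in_uc: bool = True) -> List[int]:
    """Convert expression values to 0/1 by comparison with per-gene medians.

    For genes up-regulated in UC, a value above the median gives 1.
    For genes up-regulated in CD, a value below the median gives 1.
    """
    if len(values) != len(medians):
        raise ValueError("values and medians must have the same length")
    if up_in_uc:
        return [1 if v > m else 0 for v, m in zip(values, medians)]
    return [1 if v < m else 0 for v, m in zip(values, medians)]


# Expression values and medians from the worked example, in the order given
values = [1.9734573, 1.9701188, 1.1136878, 2.8159726, 2.7689527,
          4.7186331, 2.0414428, 2.1097156, 1.7163029, 2.1842115,
          2.4673306, 2.9328217, 1.6551834, 5.2526517, 2.4706825]
medians = [3.4117391, 3.2046994, 3.44135835, 5.10064625, 4.923122,
           5.00327205, 3.33740685, 4.17297635, 2.2498484, 3.638494,
           5.400392, 3.835166, 2.6820964, 5.1378286, 4.3677868]

flags = binarize(values, medians)
print(flags)
```

Only the fourteenth value (5.2526517) exceeds its median (5.1378286), so the result has a single 1 in that position, in agreement with the binary vector stated in the text.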
Combined Datasets was then used as the training set and GSE75214 as the validation set to verify the effect of the model. To determine the optimal penalty coefficient, we performed 8-fold cross-validation with the area under the receiver operating characteristic (ROC) curve (AUC) as the performance metric, and fitted the final model with the larger penalty coefficient, i.e. the λ corresponding to the optimal AUC plus one standard error. The cross-validation plot for model construction is shown in FIG. 3 (the left dashed line marks the λ corresponding to the maximum AUC; the right dashed line marks the λ corresponding to the maximum AUC plus one standard error, i.e. the penalty coefficient selected in this procedure).
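One common reading of "optimal AUC plus one standard error" is the one-standard-error rule: among all penalty values whose mean cross-validated AUC is within one standard error of the best AUC, choose the largest λ, which gives the sparsest model. The sketch below illustrates that rule only; it is an assumption that this matches the procedure used here, and the cross-validation numbers are hypothetical, not results from the patent.

```python
from typing import List, NamedTuple


class CvPoint(NamedTuple):
    lam: float        # penalty coefficient lambda
    mean_auc: float   # mean AUC over the cross-validation folds
    se_auc: float     # standard error of the AUC over the folds


def lambda_max_auc(path: List[CvPoint]) -> float:
    """Lambda with the highest mean cross-validated AUC."""
    return max(path, key=lambda p: p.mean_auc).lam


def lambda_one_se(path: List[CvPoint]) -> float:
    """Largest lambda whose mean AUC is within one SE of the best AUC."""
    best = max(path, key=lambda p: p.mean_auc)
    threshold = best.mean_auc - best.se_auc
    return max(p.lam for p in path if p.mean_auc >= threshold)


# Hypothetical cross-validation path (not data from the patent)
cv_path = [
    CvPoint(lam=0.001, mean_auc=0.770, se_auc=0.020),
    CvPoint(lam=0.010, mean_auc=0.800, se_auc=0.020),
    CvPoint(lam=0.050, mean_auc=0.785, se_auc=0.025),
    CvPoint(lam=0.100, mean_auc=0.760, se_auc=0.030),
]

print(lambda_max_auc(cv_path), lambda_one_se(cv_path))
```

A larger λ shrinks more coefficients to zero, which is consistent with the final model retaining only 4 of the 15 candidate genes.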
The final differential diagnosis model is constructed as follows:
P_UC = exp(MMPs Scores) / (1 + exp(MMPs Scores))  (1)
MMPs Scores = -1.3813 + [ANXA1 × (0.6358)] + [CXCL13 × (0.1000)] + [MMP1 × (0.2507)] + [CXCL1 × (0.4478)]  (2)
P_UC + P_CD = 1  (3)
Note: P_UC is the probability, calculated from the model, that the case is predicted to be UC. Because the model discriminates between UC and CD, P_UC + P_CD = 1, so the probability that the case is predicted to be CD, P_CD, can be obtained indirectly from P_UC.
To make the differential diagnosis model more convenient to apply, it was drawn as a nomogram, shown in FIG. 4, in which the red dots mark an application example. For a patient with a CXCL13 value of 0, an MMP1 value of 1, an ANXA1 value of 0 and a CXCL1 value of 1, the predicted probability of a UC diagnosis is 0.336 and the predicted probability of a CD diagnosis is 0.664. With a cutoff value of 0.5, the model constructed by the present method classifies this patient as CD.
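The nomogram example can be reproduced directly from formulas (1) to (3). The following is a minimal sketch: the intercept and coefficients are those of formula (2), while the function names and the "undetermined" label for a probability of exactly 0.5 (a case the text does not define) are assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

INTERCEPT = -1.3813
COEFFICIENTS: Dict[str, float] = {
    "ANXA1": 0.6358,
    "CXCL13": 0.1000,
    "MMP1": 0.2507,
    "CXCL1": 0.4478,
}


def mmps_score(binary: Dict[str, int]) -> float:
    """Formula (2): linear score from the four binary gene variables."""
    for gene in COEFFICIENTS:
        if binary[gene] not in (0, 1):
            raise ValueError(f"{gene} must be 0 or 1")
    return INTERCEPT + sum(COEFFICIENTS[g] * binary[g] for g in COEFFICIENTS)


def predict(binary: Dict[str, int]) -> Tuple[float, float, str]:
    """Formulas (1) and (3) plus the 0.5 cutoff."""
    score = mmps_score(binary)
    p_uc = math.exp(score) / (1.0 + math.exp(score))
    p_cd = 1.0 - p_uc
    if p_uc > 0.5:
        label = "UC"
    elif p_uc < 0.5:
        label = "CD"
    else:
        label = "undetermined"  # assumption: the text does not cover P_UC == 0.5
    return p_uc, p_cd, label


# Patient from the nomogram example
patient = {"CXCL13": 0, "MMP1": 1, "ANXA1": 0, "CXCL1": 1}
p_uc, p_cd, label = predict(patient)
print(round(p_uc, 3), round(p_cd, 3), label)  # 0.336 0.664 CD
```

For this patient the score is -1.3813 + 0.2507 + 0.4478 = -0.6828, which gives the probabilities stated in the text.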
5. Model evaluation
The model constructed as described above was applied to the training set (data sets GSE10616, GSE36807 and GSE9686), validation set 1 (data set GSE75214) and validation set 2 (data set GSE179285), and its discrimination (ROC curve), calibration (calibration curve) and clinical applicability (DCA curve) were examined in each, with the following results:
1. Training set results: the area under the ROC curve for Combined Datasets is 0.801, the calibration curve shows good calibration (Sp > 0.05, Brier score < 0.25), and the DCA curve shows good clinical applicability (see FIG. 5).
2. Validation set 1 results: the area under the ROC curve for GSE75214 is 0.811, the calibration curve shows good calibration (Sp > 0.05, Brier score < 0.25), and the DCA curve shows good clinical applicability (see FIG. 6). The training set data come from chip platform GPL570 and the validation set data come from chip platform GPL6244, which shows that the model performs well across platforms.
3. Validation set 2 results: because the data sets above were all used for gene screening, a newly released cohort, GSE179285, was selected for further model validation. The area under the ROC curve for GSE179285 is 0.751, the calibration curve shows good calibration (Sp > 0.05, Brier score < 0.25), and the DCA curve shows good clinical applicability (see FIG. 7). The training set data come from chip platform GPL570 and the validation set data come from chip platform GPL6480, which again shows that the model performs well across platforms.
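Two of the metrics reported above, the area under the ROC curve and the Brier score, are simple to compute from true labels and predicted UC probabilities. The sketch below is a dependency-free illustration with hypothetical labels and probabilities; it is not the evaluation code behind the reported figures, and the function names are introduced here.

```python
from typing import List


def roc_auc(labels: List[int], probs: List[float]) -> float:
    """AUC as the fraction of (UC, CD) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(labels: List[int], probs: List[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical cohort: 1 = UC, 0 = CD, with predicted P_UC values
labels = [1, 1, 0, 0, 1, 0]
probs = [0.51, 0.45, 0.34, 0.45, 0.51, 0.20]

auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
print(round(auc, 3), round(brier, 3))
```

A Brier score of 0.25 corresponds to always predicting a probability of 0.5, which is why values below 0.25 are read above as evidence of useful calibration.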
The present invention is described in detail above. It will be apparent to those skilled in the art that the present invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with respect to specific embodiments, it will be appreciated that the invention may be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The application of some of the basic features may be done in accordance with the scope of the claims that follow.

Claims (7)

1. An auxiliary device for judging Crohn's disease and ulcerative colitis comprises parameter acquisition equipment and a readable carrier;
the parameter acquisition device comprises a device for acquiring various parameters involved in the readable carrier;
the readable carrier has recorded thereon the following formulas (1) - (3),
P_UC = exp(MMPs Scores) / (1 + exp(MMPs Scores))  (1)
MMPs Scores = -1.3813 + [ANXA1 × (0.6358)] + [CXCL13 × (0.1000)] + [MMP1 × (0.2507)] + [CXCL1 × (0.4478)]  (2)
P_UC + P_CD = 1  (3);
wherein P_UC is the probability that the sample to be tested is predicted as ulcerative colitis; P_CD is the probability that the case to be tested is predicted as Crohn's disease; ANXA1, CXCL13, MMP1 and CXCL1 are the binary variables of the ANXA1, CXCL13, MMP1 and CXCL1 genes, respectively; if the expression value of a gene in the sample to be tested is greater than the median expression value of that gene in ulcerative colitis samples, the binary variable of the gene is assigned 1; otherwise, the binary variable of the gene is assigned 0;
when P_UC is greater than 0.5, the sample to be tested is ulcerative colitis; when P_UC is less than 0.5, the sample to be tested is Crohn's disease.
2. The apparatus according to claim 1, wherein: the parameter acquisition equipment is a device for detecting the expression quantity of ANXA1, CXCL13, MMP1 and CXCL1 genes in a sample to be detected.
3. The apparatus according to claim 1 or 2, characterized in that: the apparatus further comprises recording means and/or computing means; the recording means comprises a pen and/or a computer; the computing means comprises a calculator and/or the computer.
4. The apparatus according to claim 1 or 2, characterized in that: the readable carrier is a kit instruction; the content of formula I is printed on a card.
5. The apparatus according to claim 1 or 2, characterized in that: the readable carrier is a computer readable carrier.
6. The apparatus according to claim 1 or 2, characterized in that: the median value of the expression value of each gene in ulcerative colitis samples is obtained by measuring the gene expression values of at least 10 ulcerative colitis samples with the same detection device and taking the average of those expression values as the median value of the expression value in ulcerative colitis samples.
7. A kit for assisting in distinguishing Crohn's disease from ulcerative colitis, characterized by comprising a device for detecting the expression level of ANXA1, a device for detecting the expression level of CXCL13, a device for detecting the expression level of MMP1, a device for detecting the expression level of CXCL1, and a computing device provided with a parameter operation module; the parameter operation module can perform the operations of the following formulas (1) - (3):
P_UC = exp(MMPs Scores) / (1 + exp(MMPs Scores)) (1)
MMPs Scores = -1.3813 + [ANXA1 × (0.6358)] + [CXCL13 × (0.1000)] + [MMP1 × (0.2507)] + [CXCL1 × (0.4478)] (2)
P_UC + P_CD = 1 (3);
wherein P_UC is the probability that the sample to be tested is predicted to be ulcerative colitis; P_CD is the probability that the sample to be tested is predicted to be Crohn's disease; ANXA1, CXCL13, MMP1 and CXCL1 are the binary variables of the ANXA1, CXCL13, MMP1 and CXCL1 genes, respectively; if the expression value of a gene in the sample to be tested is larger than the median value of the expression value of that gene in ulcerative colitis samples, the binary variable of the gene is assigned a value of 1; otherwise, the binary variable of the gene is assigned a value of 0;
when P_UC is greater than 0.5, the sample to be tested is ulcerative colitis; when P_UC is less than 0.5, the sample to be tested is Crohn's disease.
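The calculation recited in formulas (1) - (3) and in claim 6 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary-based input format, and the "undetermined" label for the case P_UC = 0.5 (which the claims do not address) are assumptions made here. The coefficients are copied from formula (2).

```python
import math
from statistics import mean

# Intercept and coefficients copied from formula (2) of the claims.
INTERCEPT = -1.3813
COEFFICIENTS = {"ANXA1": 0.6358, "CXCL13": 0.1000, "MMP1": 0.2507, "CXCL1": 0.4478}


def reference_values(uc_samples):
    """Per-gene reference value from at least 10 ulcerative colitis samples.

    Claim 6 describes taking the average of the measured values and using it
    as the "median value"; this sketch follows that wording and uses the mean.
    """
    if len(uc_samples) < 10:
        raise ValueError("claim 6 calls for at least 10 ulcerative colitis samples")
    return {gene: mean(s[gene] for s in uc_samples) for gene in COEFFICIENTS}


def binarize(expression, reference):
    """Assign 1 if a gene's expression exceeds its reference value, else 0."""
    return {gene: int(expression[gene] > reference[gene]) for gene in COEFFICIENTS}


def mmps_score(binary):
    """Formula (2): intercept plus coefficient-weighted binary variables."""
    return INTERCEPT + sum(COEFFICIENTS[gene] * binary[gene] for gene in COEFFICIENTS)


def p_uc(score):
    """Formula (1): logistic transform of the MMPs score."""
    return math.exp(score) / (1.0 + math.exp(score))


def classify(expression, reference):
    """Return (P_UC, P_CD, label) for one sample; P_CD follows from formula (3)."""
    p = p_uc(mmps_score(binarize(expression, reference)))
    if p > 0.5:
        label = "ulcerative colitis"
    elif p < 0.5:
        label = "Crohn's disease"
    else:
        label = "undetermined"  # assumption: the claims do not define P_UC == 0.5
    return p, 1.0 - p, label
```

With these coefficients the four weights sum to 1.4343, so the score is positive (-1.3813 + 1.4343 = 0.0530, P_UC ≈ 0.513) only when all four binary variables equal 1; if any one of them is 0, the score is negative and P_UC falls below 0.5.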
CN202310559017.XA 2022-10-28 2023-05-17 Device for differential diagnosis of Crohn's disease and ulcerative colitis Active CN116631510B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211340533 2022-10-28
CN2022113405335 2022-10-28

Publications (2)

Publication Number Publication Date
CN116631510A (en) 2023-08-22
CN116631510B (en) 2024-01-12

Family

ID=87601934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310559017.XA Active CN116631510B (en) 2022-10-28 2023-05-17 Device for differential diagnosis of Crohn's disease and ulcerative colitis

Country Status (1)

Country Link
CN (1) CN116631510B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009192383A (en) * 2008-02-14 2009-08-27 Kanazawa Univ Diagnosis and medical treatment of crohn's disease using ergothioneine
CN105219844A (en) * 2015-06-08 2016-01-06 刘宗正 A kind of compose examination 11 kinds of diseases gene marker combination, test kit and disease risks predictive model
CN108403711A (en) * 2017-02-10 2018-08-17 中国科学院上海生命科学研究院 A kind of microRNA for detecting and treating inflammatory bowel disease
CN109994214A (en) * 2019-04-13 2019-07-09 中国医学科学院北京协和医院 Identification model and model construction method for Crohn's disease and intestinal Behçet's disease
CN109998488A (en) * 2019-04-13 2019-07-12 中国医学科学院北京协和医院 Identification model and construction method for Crohn's disease and intestinal ulcerative lymphoma
CN113744802A (en) * 2021-08-25 2021-12-03 聂凯 Screening method and application of gene marker for predicting Crohn's disease treatment response
CN114732899A (en) * 2015-01-09 2022-07-12 辉瑞公司 Dosing regimens for MAdCAM antagonists
CN114974595A (en) * 2022-05-13 2022-08-30 江苏省人民医院(南京医科大学第一附属医院) Crohn's disease patient mucosa healing prediction model and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
MXPA04005219A (en) * 2001-11-29 2005-06-20 Greystone Medical Group Inc Treatment of wounds and compositions employed.

Also Published As

Publication number Publication date
CN116631510A (en) 2023-08-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant