CN117288962A - Application of reagent for detecting marker combination in preparation of biliary tract locking diagnosis product - Google Patents

Application of reagent for detecting marker combination in preparation of biliary tract locking diagnosis product

Info

Publication number
CN117288962A
Authority
CN
China
Prior art keywords
marker
biliary
a0a5c2fx14
a0a1s5uz16
marker combination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311304291.9A
Other languages
Chinese (zh)
Inventor
张锐忠
付铭
童燕陆
王贺珍
陈虹交
陈严
夏慧敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Women and Childrens Medical Center
Original Assignee
Guangzhou Women and Childrens Medical Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Women and Childrens Medical Center
Priority to CN202311304291.9A
Publication of CN117288962A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6893 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids related to diseases not provided for elsewhere
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/08 Hepato-biliairy disorders other than hepatitis
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/50 Determining the risk of developing a disease
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/60 Complex ways of combining multiple protein biomarkers for diagnosis

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Genetics & Genomics (AREA)
  • Hematology (AREA)
  • Urology & Nephrology (AREA)
  • Chemical & Material Sciences (AREA)
  • Pathology (AREA)
  • Biotechnology (AREA)
  • Public Health (AREA)
  • Analytical Chemistry (AREA)
  • Evolutionary Biology (AREA)
  • Medicinal Chemistry (AREA)
  • General Physics & Mathematics (AREA)
  • Food Science & Technology (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biochemistry (AREA)
  • Cell Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The application discloses the use of a reagent for detecting a marker combination in the preparation of a diagnostic product for biliary atresia (BA). In a first aspect, the application provides the use of a reagent for detecting a marker combination comprising A0A5C2FX14, A0A1S5UZ16, P01833 and γGT in the manufacture of a biliary atresia diagnostic product. In ROC curve analysis, the combined diagnostic model built from these markers has a very high AUC value, and at the set cut-off value the model also achieves high accuracy, sensitivity and specificity. The combination of A0A5C2FX14, A0A1S5UZ16 and P01833 together with γGT can therefore serve as an early diagnostic marker for BA, enabling efficient diagnosis of biliary atresia.

Description

Application of reagent for detecting marker combination in preparation of biliary tract locking diagnosis product
Technical Field
The application relates to the technical field of hepatobiliary disease diagnosis, and in particular to the use of a reagent for detecting a marker combination in the preparation of a biliary atresia diagnostic product.
Background
Biliary atresia (BA) is an obstructive biliary tract disease that presents mainly as neonatal jaundice; it has a poor prognosis, a high mortality rate, and an unknown etiology and pathogenesis. The basic pathological features of BA are progressive inflammation of the intrahepatic and extrahepatic bile ducts, bile duct obliteration and liver fibrosis. Liver fibrosis in BA progresses faster and more aggressively than in adult liver diseases. The Kasai operation is the first-line treatment for BA and can relieve cholestasis in affected infants, but most infants who undergo a successful Kasai operation still develop liver failure because of continued progressive destruction of the intrahepatic bile ducts, and eventually require liver transplantation. Infants with BA are often diagnosed 1 to 4 months after birth, by which time most of their livers already show fibrosis or even cirrhosis, so surgical outcomes are poor. Finding an early diagnostic method for BA is therefore extremely important.
Many diagnostic methods for BA are currently available. Dynamic observation of serum bilirubin, in which the serum bilirubin level is measured periodically, has limited specificity and makes differential diagnosis difficult. Ultrasound examination relies mainly on indexes of gallbladder morphology and contractile function, and interpretation of the images depends on the examiner's experience. The 99mTc-diethyl iminodiacetic acid (DIDA) excretion test has a high hepatocyte extraction rate, but takes too long to identify BA. Lipoprotein-X (Lp-X) quantification reflects cholestasis effectively, but it is difficult to distinguish biliary atresia from other hepatobiliary diseases with similar symptoms. Quantitative bile acid determination has diagnostic value, but produces marked false positives in some children. Liver biopsy histopathology is difficult to interpret and is invasive. Cholangiography is currently the gold standard for diagnosing biliary atresia, but the patient must meet certain age requirements, which leads to a poor surgical prognosis. A simple and rapid method for early diagnosis of biliary atresia with higher specificity and sensitivity is therefore needed.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the prior art. To this end, the application provides the use of a reagent for detecting a marker combination in the preparation of a biliary atresia diagnostic product; with this reagent, simple and rapid early diagnosis with high specificity and sensitivity can be achieved.
In a first aspect of the present application, there is provided the use of a reagent for detecting a marker combination comprising A0A5C2FX14, A0A1S5UZ16, P01833 and γGT in the manufacture of a biliary atresia diagnostic product.
The use according to the embodiments of the present application has at least the following beneficial effects:
In ROC curve analysis, the combined diagnostic model built from these markers has a very high AUC value, and at the set cut-off value the model also achieves high accuracy, sensitivity and specificity. The combination of A0A5C2FX14, A0A1S5UZ16 and P01833 together with γGT can therefore serve as an early diagnostic marker for BA, enabling efficient diagnosis of biliary atresia.
In some embodiments of the present application, the reagent is used to detect marker combinations at the protein level.
In some embodiments of the present application, the reagent detects the marker combination at the protein level by at least one of immunostaining, immunofluorescence, immunochromatography, western blotting, ELISA, flow cytometry, spectrophotometry, infrared spectrometry, mass spectrometry, chromatography, colorimetry.
In some embodiments of the present application, the reagent comprises an antibody that specifically binds to a marker.
In some embodiments of the present application, the reagent is used to detect the marker combination in any of blood, serum and plasma.
In some embodiments of the present application, the biliary atresia diagnostic product is used to diagnose children aged 0 to 4 months, such as children aged 1 month, 2 months, 3 months or 4 months.
In some embodiments of the present application, the biliary atresia diagnostic product is used to diagnose children within 60 days of age, such as children within 55, 50, 45, 40, 35, 30, 25, 20, 15 or 10 days of age.
In some embodiments of the present application, the biliary atresia diagnostic product is used to distinguish between intrahepatic cholestasis and biliary atresia.
In a second aspect of the present application, there is provided a biliary atresia diagnostic product comprising reagents for detecting a marker combination comprising A0A5C2FX14, A0A1S5UZ16, P01833 and γGT.
In some embodiments of the present application, the biliary atresia diagnostic product includes at least one of an immunostaining kit, an immunofluorescence kit, an immunochromatography kit, a western blot kit, an ELISA kit, a flow cytometry kit, and the like for detecting the marker combination.
In a third aspect of the present application, there is provided a computer-readable storage medium storing computer-executable instructions for causing a computer to:
step 1: obtaining information on the expression levels of a marker combination in a sample from a subject, the marker combination comprising A0A5C2FX14, A0A1S5UZ16, P01833 and γGT;
step 2: mathematically correlating the expression levels to obtain a score; the score is used to indicate the subject's risk of biliary atresia.
In some embodiments of the present application, the method of mathematically correlating the expression levels of individual markers in a marker combination comprises constructing a scored diagnostic model.
In some embodiments of the present application, the method of constructing the scored diagnostic model includes constructing the diagnostic model by at least one of a random forest algorithm, a logistic regression algorithm, a support vector machine algorithm, a K-nearest neighbor algorithm.
In some embodiments of the present application, the score is calculated from Z by a formula in which e is the natural base (the score formula is not reproduced in this text), and Z is calculated according to the formula $Z = a_0 + \sum_{i=1}^{4} a_i b_i$, where $a_i$ is the set weight of the i-th marker among A0A5C2FX14, A0A1S5UZ16, P01833 and γGT, $b_i$ is the expression level of the i-th marker among A0A5C2FX14, A0A1S5UZ16, P01833 and γGT, and $a_0$ is the set intercept. It will be appreciated that $a_0$ to $a_4$ may take different values depending on the algorithm and the training set. In addition, the scoring formula may take one or more other forms depending on the algorithm.
In some embodiments of the present application, the expression level of the i-th marker is the expression level of the i-th marker after preprocessing.
In some embodiments of the present application, the preprocessing includes at least one of normalization and standardization.
In some embodiments of the present application, the normalization includes at least one of linear normalization (such as the extremum method or the standard deviation method), polyline normalization (such as the three-polyline method), curvilinear normalization (such as the semi-normal distribution), and the like.
In some embodiments of the present application, the standardization includes Z-score standardization.
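The two kinds of preprocessing named above can be illustrated with a minimal sketch. It assumes that the "extremum method" refers to min-max rescaling and that Z-score standardization uses the population standard deviation; the function names and the sample values are hypothetical and are not taken from the application.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Linear (extremum-method) normalization: rescale values to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_standardize(values):
    """Z-score standardization: subtract the mean, divide by the SD."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; sample SD is equally plausible
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical expression levels of one marker across four samples.
levels = [2.0, 4.0, 6.0, 8.0]
normalized = min_max_normalize(levels)
standardized = z_score_standardize(levels)
```

Normalization bounds the values to a fixed interval, whereas standardization centers them at zero with unit spread, which is the form the later scoring formula expects.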
In some embodiments of the present application, the score indicates the subject's risk of biliary atresia among pediatric cholestatic diseases.
In some embodiments of the present application, the pediatric cholestatic disease includes at least one of biliary atresia, Alagille syndrome, primary sclerosing cholangitis, extrahepatic biliary obstruction, intrahepatic cholestasis, and the like.
In some embodiments of the present application, using the score to indicate the subject's risk of biliary atresia comprises: when the score is above the cut-off value, the subject's risk of biliary atresia is high (biliary atresia positive); when the score is not above the cut-off value, the subject's risk of biliary atresia is low (biliary atresia negative).
It will be appreciated that the cut-off value can be set reasonably and freely according to factors such as the model, the training set and the diagnostic purpose.
In some embodiments of the present application, the cut-off value is determined by the Youden index method.
In some embodiments of the present application, the score is calculated from Z by a formula in which e is the natural base (the score formula is not reproduced in this text), and Z is calculated according to the formula $Z = -0.52 - 1.25 \times \text{A0A5C2FX14} - 1.5 \times \text{A0A1S5UZ16} - 1.98 \times \text{P01833} - 2.24 \times \gamma\text{GT}$; each marker name in the formula for Z stands for the expression level of the corresponding marker protein after Z-score standardization. In some embodiments, the cut-off value is 0.66: when the score is greater than 0.66, the subject is diagnosed as being at high risk of biliary atresia; when the score is not greater than 0.66, the subject is diagnosed as being at low risk of biliary atresia.
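As a rough illustration of how such a score might be computed, the sketch below evaluates Z with the coefficients stated above and compares the result with the 0.66 cut-off. The mapping from Z to the score is an assumption: the text only says that it involves the natural base e, so the standard logistic function 1/(1 + e^(-Z)) is used here as a stand-in, and the sign convention of the actual formula may differ. All function and variable names are hypothetical.

```python
import math

# Coefficients of Z as stated in the text; inputs are Z-score standardized levels.
INTERCEPT = -0.52
WEIGHTS = {
    "A0A5C2FX14": -1.25,
    "A0A1S5UZ16": -1.5,
    "P01833": -1.98,
    "gammaGT": -2.24,
}
CUTOFF = 0.66


def linear_predictor(std_levels):
    """Z = a0 + sum(a_i * b_i) over the four markers."""
    return INTERCEPT + sum(WEIGHTS[m] * std_levels[m] for m in WEIGHTS)


def score_from_z(z):
    """ASSUMPTION: standard logistic function; the source formula is not shown."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(std_levels):
    """Return the score and the risk label relative to the cut-off."""
    s = score_from_z(linear_predictor(std_levels))
    return s, ("high risk" if s > CUTOFF else "low risk")


# Hypothetical subject whose standardized levels all equal the cohort mean (0).
example = {"A0A5C2FX14": 0.0, "A0A1S5UZ16": 0.0, "P01833": 0.0, "gammaGT": 0.0}
z = linear_predictor(example)
score, label = classify(example)
```

For this all-zero input Z reduces to the intercept, so the sketch mainly shows the order of operations: standardize, form the weighted sum, map to a score, then threshold.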
In a fourth aspect of the present application, there is provided an electronic device comprising a processor and a memory, the memory having stored thereon a computer program executable on the processor, the processor when executing the computer program performing the operations of:
step 1: obtaining information on the expression levels of a marker combination in a sample from a subject, the marker combination comprising A0A5C2FX14, A0A1S5UZ16, P01833 and γGT;
step 2: mathematically correlating the expression levels to obtain a score; the score is used to indicate the subject's risk of biliary atresia.
In some embodiments of the present application, the method of mathematically correlating the expression levels of individual markers in a marker combination comprises constructing a scored diagnostic model.
In some embodiments of the present application, the method of constructing the scored diagnostic model includes constructing the diagnostic model by at least one of a random forest algorithm, a logistic regression algorithm, a support vector machine algorithm, a K-nearest neighbor algorithm.
In some embodiments of the present application, the score is calculated from Z by a formula in which e is the natural base (the score formula is not reproduced in this text), and Z is calculated according to the formula $Z = a_0 + \sum_{i=1}^{4} a_i b_i$, where $a_i$ is the set weight of the i-th marker among A0A5C2FX14, A0A1S5UZ16, P01833 and γGT, $b_i$ is the expression level of the i-th marker among A0A5C2FX14, A0A1S5UZ16, P01833 and γGT, and $a_0$ is the set intercept. It will be appreciated that $a_0$ to $a_4$ may take different values depending on the algorithm and the training set. In addition, the scoring formula may take one or more other forms depending on the algorithm.
In some embodiments of the present application, the expression level of the i-th marker is the expression level of the i-th marker after preprocessing.
In some embodiments of the present application, the preprocessing includes at least one of normalization and standardization.
In some embodiments of the present application, the normalization includes at least one of linear normalization (such as the extremum method or the standard deviation method), polyline normalization (such as the three-polyline method), curvilinear normalization (such as the semi-normal distribution), and the like.
In some embodiments of the present application, the standardization includes Z-score standardization.
In some embodiments of the present application, the score indicates the subject's risk of biliary atresia among cholestatic diseases.
In some embodiments of the present application, the cholestatic disease includes at least one of biliary atresia, idiopathic cholestasis, Alagille syndrome, and progressive familial intrahepatic cholestasis.
In some embodiments of the present application, using the score to indicate the subject's risk of biliary atresia comprises: when the score is above the cut-off value, the subject's risk of biliary atresia is high (biliary atresia positive); when the score is not above the cut-off value, the subject's risk of biliary atresia is low (biliary atresia negative).
It will be appreciated that the cut-off value can be set reasonably and freely according to factors such as the model, the training set and the diagnostic purpose.
In some embodiments of the present application, the cut-off value is determined by the Youden index method.
In some embodiments of the present application, the score is calculated from Z by a formula in which e is the natural base (the score formula is not reproduced in this text), and Z is calculated according to the formula $Z = -0.52 - 1.25 \times \text{A0A5C2FX14} - 1.5 \times \text{A0A1S5UZ16} - 1.98 \times \text{P01833} - 2.24 \times \gamma\text{GT}$; each marker name in the formula for Z stands for the expression level of the corresponding marker protein after Z-score standardization. In some embodiments, the cut-off value is 0.66: when the score is greater than 0.66, the subject is diagnosed as being at high risk of biliary atresia; when the score is not greater than 0.66, the subject is diagnosed as being at low risk of biliary atresia.
In the embodiments of the present application, differentially expressed proteins in plasma were screened by DIA proteomics, and a marker combination was constructed by machine learning. The resulting plasma diagnostic model, A0A5C2FX14 + A0A1S5UZ16 + P01833 combined with γGT, has an AUC value of 0.944 in a randomly partitioned test set; at a cut-off value of 0.66, its accuracy in the test set is 89.47%, its sensitivity 90% and its specificity 88.89%, so it can serve as an effective early diagnostic marker for BA.
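The three reported performance figures follow from a confusion matrix in the usual way. The sketch below uses hypothetical counts (9 true positives, 1 false negative, 8 true negatives, 1 false positive) chosen only because they reproduce the stated percentages; the application does not give the actual test-set counts.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# ASSUMPTION: illustrative counts consistent with 89.47% / 90% / 88.89%.
metrics = diagnostic_metrics(tp=9, fn=1, tn=8, fp=1)
percentages = {name: round(value * 100, 2) for name, value in metrics.items()}
```

With these counts, accuracy is 17/19, sensitivity 9/10 and specificity 8/9.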
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
FIG. 1 is a candidate marker weight value ranking result in one embodiment of the present application. Wherein, the abscissa is the weight value, and the ordinate is the corresponding protein name.
FIG. 2 is a graph of the cumulative AUC trend of the candidate marker classification model in one embodiment of the present application. The abscissa gives the protein names of the markers successively added to the classification model, and the ordinate is the AUC value of the model after each marker protein has been added.
FIG. 3 is a box plot of the AUC, sensitivity and specificity obtained in the evaluation of three single models in one embodiment of the present application. The abscissa is the value of AUC, sensitivity or specificity (the closer to 1, the better); the ordinate is the algorithm model used: LR is logistic regression, SVM is support vector machine, and RF is random forest.
FIG. 4 shows the ROC curves for the three single-model evaluations in one embodiment of the present application. The abscissa is the false positive rate (FPR, i.e., 1 - specificity) and the ordinate is the true positive rate (TPR, i.e., sensitivity).
FIG. 5 is a graph of accuracy, sensitivity, and specificity of three single model evaluations as a function of cut-off value (cutoff) used in one embodiment of the present application. Wherein, from left to right, respectively represent the graph of a logistic regression model (LR), a support vector machine model (SVM) and a random forest model (RF); the abscissa is the corresponding cut-off value; the ordinate is the percentage corresponding to accuracy, sensitivity and specificity.
FIG. 6 is a graph of importance coefficients for random forest computation markers in one embodiment of the present application. Wherein the abscissa is the importance coefficient, and the ordinate is the corresponding marker protein name.
FIG. 7 is a box plot of the expression of candidate markers in samples of the disease control group and biliary tract occlusion group in one embodiment of the present application.
FIG. 8 shows the results of correlation analysis between candidate markers in one embodiment of the present application. The abscissa gives the names of the corresponding candidate marker proteins; the area and color depth of each circle reflect the correlation coefficient: the stronger the correlation, the larger and darker the circle.
FIG. 9 is a plot of specificity versus cut-off (A) and accuracy versus cut-off (B) for one embodiment of the present application.
FIG. 10 is a ROC graph of randomly partitioned training and test sets applying the diagnostic model described above in one embodiment of the present application.
FIG. 11 is a ROC curve for different diagnostic combination models in the comparative examples of the present application.
Detailed Description
The conception and technical effects produced by the present application will be clearly and completely described below in connection with the embodiments to fully understand the objects, features and effects of the present application. It is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments, and that other embodiments obtained by those skilled in the art without inventive effort based on the embodiments of the present application are within the scope of the present application.
The following detailed description of embodiments of the present application is exemplary and is provided merely for purposes of explanation and not to be construed as limiting the application.
In the description of the present application, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood to exclude the stated number, while "above", "below", "within" and the like are understood to include it; "about" means within ±20%, 10%, 8%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.2%, 0.1%, etc. of the stated number. The terms "first" and "second" are used only to distinguish technical features and should not be construed as indicating or implying relative importance, the number of the technical features indicated, or their order of precedence.
In the description of the present application, a description with reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Example 1
Materials and methods
The enrolled subjects comprised a disease control group and a biliary atresia group. The disease control group consisted of infants diagnosed with intrahepatic cholestasis (IHC, n=35); the biliary atresia group consisted of infants diagnosed with biliary atresia (BA, n=32). All infants in both groups were less than 60 days old and were recruited from Guangzhou Women and Children's Medical Center (Guangzhou, China) between 2019 and 2021. IHC and BA were diagnosed on the basis of clinical features, hepatobiliary biochemical indexes, intraoperative cholangiography and postoperative histopathological examination of the extrahepatic bile ducts.
The study was approved by the institutional review board of Guangzhou Women and Children's Medical Center (approval No. #34500), and written informed consent was obtained for all subjects prior to the study.
The main reagents and consumables involved in the experimental procedure are shown in tables 1 and 2 below:
TABLE 1 sources of reagents
TABLE 2 Experimental instrument Source
The method in this embodiment is as follows:
2 mL of peripheral blood was collected from BA and IHC infants in EDTA anticoagulant tubes and centrifuged; the resulting supernatant was stored in a -80 °C freezer for DIA proteomics analysis. The specific steps are as follows:
For each sample, 2 μg of peptides was taken and spiked with an appropriate amount of iRT standard peptides, and each sample was analyzed once by 90-min DIA mass spectrometry. Chromatographic separation was performed on a nanoflow HPLC system (Easy nLC). The chromatographic buffers and parameters were as follows. Buffers: solution A was 0.1% formic acid in water, and solution B was 0.1% formic acid in aqueous acetonitrile (84% acetonitrile). The column was equilibrated with 95% solution A. Samples were loaded onto the Trap Column and then separated on a 25 cm tip-Column analytical column at a flow rate of 250 nL/min. The liquid-phase separation gradient was as follows: 0-70 min, solution B rose linearly from 8% to 30%; 70-80 min, solution B rose linearly from 30% to 100%; 80-90 min, solution B was held at 100%.
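The separation gradient described above is piecewise linear, so the proportion of solution B at any time point can be written as a small function. This is only an illustrative sketch of the stated time points; the function name is hypothetical and instrument software would normally define the gradient as a table.

```python
def percent_b(t_min):
    """Percent of solution B at time t_min (minutes) for the stated gradient."""
    if not 0 <= t_min <= 90:
        raise ValueError("time must lie within the 90-min run")
    if t_min <= 70:
        # 0-70 min: linear rise from 8% to 30% B
        return 8 + (30 - 8) * t_min / 70
    if t_min <= 80:
        # 70-80 min: linear rise from 30% to 100% B
        return 30 + (100 - 30) * (t_min - 70) / 10
    # 80-90 min: held at 100% B
    return 100.0


profile = [percent_b(t) for t in (0, 35, 70, 75, 85)]
```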
After separation by nanoflow high performance liquid chromatography, the samples were analyzed by DIA mass spectrometry on a Q Exactive HF mass spectrometer (Thermo Scientific). Analysis duration: 90 min; detection mode: positive ion. Primary mass spectrum (MS1) scan range: 350-1650 m/z; mass spectrum resolution: 120,000 (@ m/z 200); AGC target: 3e6; maximum IT: 50 ms. MS2 used the DIA data acquisition mode with 30 DIA acquisition windows; mass spectrum resolution: 30,000 (@ m/z 200); AGC target: 3e6; maximum IT: auto; MS2 activation type: HCD; normalized collision energy: 30; spectral data type: profile.
The DIA data were processed with Spectronaut software (Spectronaut Pulsar X, version 12.0.20491.4), using the same database as for library construction. The software parameters were set as follows: the retention time prediction type was set to dynamic iRT, interference correction at the MS2 level was enabled, cross-run normalization was enabled, and all results had to pass the filter parameter Q Value cutoff of 0.01 (corresponding to FDR < 1%).
Machine learning was then performed on the proteomics results, as follows:
(1) Data preprocessing
A preprocessing method is selected according to the characteristics of each type of data. For proteomics and metabolomics data, missing values are generally imputed with the KNN method, expression values in single-omics analysis are log-transformed, and Z-score standardization is used for multi-omics joint analysis.
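The preprocessing chain can be sketched in plain Python as follows. This is a simplified illustration, not the pipeline used in the application: the neighbour distance (mean squared difference over jointly observed features), the choice of k, the base-2 logarithm and the use of the population standard deviation are all assumptions, and the toy matrix is invented.

```python
import math
from statistics import mean, pstdev


def knn_impute(rows, k=1):
    """Fill None entries with the mean of that feature in the k nearest rows."""
    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_idx, other in enumerate(rows):
                if other_idx == i or other[j] is None:
                    continue
                # Distance over features observed in both rows (assumption).
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    candidates.append((mean(shared), other[j]))
            if not candidates:
                raise ValueError("no neighbour with an observed value")
            candidates.sort(key=lambda pair: pair[0])
            filled[i][j] = mean([v for _, v in candidates[:k]])
    return filled


def log2_transform(rows):
    """Log-transform every (positive) expression value."""
    return [[math.log2(v) for v in row] for row in rows]


def z_score_columns(rows):
    """Standardize each feature (column) to mean 0 and SD 1."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [[(v - mu) / sd if sd else 0.0 for v, (mu, sd) in zip(row, stats)]
            for row in rows]


# Toy matrix: rows are samples, columns are two hypothetical proteins.
raw = [[4.0, 8.0], [4.0, None], [16.0, 32.0], [16.0, 32.0]]
imputed = knn_impute(raw, k=1)
processed = z_score_columns(log2_transform(imputed))
```

In practice an established imputation routine would normally replace the hand-written `knn_impute`; it is spelled out here only to make the steps explicit.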
(2) Pre-screening by statistical tests
All substance data were analyzed by t-test or other statistical methods to screen out the substances that differ significantly between the comparison groups.
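As one possible form of the t-test mentioned above, the sketch below computes Welch's unequal-variance t statistic and its approximate degrees of freedom for a single protein. Whether the application used Welch's or Student's test is not stated, so this choice is an assumption; converting the statistic to a p-value requires a t-distribution CDF from a statistics package and is left out. The group values are invented.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)  # sample variances
    se2 = va / na + vb / nb
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical log-transformed levels of one protein in two groups.
ihc_levels = [1.0, 2.0, 3.0, 4.0, 5.0]
ba_levels = [3.0, 4.0, 5.0, 6.0, 7.0]
t_stat, dof = welch_t(ihc_levels, ba_levels)
```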
(3) Secondary screening by ensemble learning
Secondary screening was performed with an ensemble machine learning method on the substance data showing significant differences: several feature selection algorithms were constructed and combined, and their outputs were integrated under a defined strategy to obtain the final result. The feature selection methods include filter methods, wrapper methods and embedded methods. Each feature selection method produces a set of potential markers.
The substances in each candidate marker set were then scored. The scoring rule calculates a cumulative score for each candidate marker from indexes such as how frequently the methods selected it, its importance and its correlation coefficient. Candidate markers are ranked from high to low by score; a higher score indicates a greater contribution of the substance to distinguishing the sample groups.
ROC analysis was used to select the optimal potential marker combination. Substances were added to the candidate marker combination one at a time in descending order of score, and after each addition the AUC value of the model built from the current candidate marker set was calculated; addition stopped once the AUC value leveled off and no longer rose. Finally, the substance set with the highest AUC value was selected as the optimal potential marker combination.
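The accumulate-until-plateau rule can be sketched as below. The AUC is computed as the probability that a positive sample outscores a negative one. The stand-in "model" (a plain sum of marker values), the `min_gain` tolerance, the marker names and the toy data are all assumptions made for illustration; the application builds real classifiers at this step.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accumulate_markers(ranked, evaluate, min_gain=1e-3):
    """Add markers in rank order; stop when the AUC no longer rises by min_gain."""
    selected, best, history = [], 0.0, []
    for marker in ranked:
        trial = selected + [marker]
        value = evaluate(trial)
        history.append((marker, value))
        if value - best < min_gain:
            break
        selected, best = trial, value
    return selected, best, history


# Toy data: three hypothetical markers, already ranked by score.
labels = [1, 1, 1, 0, 0, 0]
data = {
    "m1": [2, 1, 0, 1, 0, -1],
    "m2": [0, 1, 2, -1, 0, 1],
    "m3": [0, 0, 0, 0, 0, 0],
}


def evaluate(markers):
    # Stand-in model: sum of marker values per sample (assumption).
    scores = [sum(data[m][i] for m in markers) for i in range(len(labels))]
    return auc(scores, labels)


selected, best_auc, history = accumulate_markers(["m1", "m2", "m3"], evaluate)
```

Here the third marker adds no AUC gain, so accumulation stops after the second.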
(4) Marker validation and evaluation
Three commonly used machine learning models, logistic regression (LR), random forest (RF) and support vector machine (SVM), are each built on the optimal potential marker combination and evaluated by K-fold cross-validation. ROC curve analysis is used to judge how well the optimal potential marker combination classifies the sample groups.
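The K-fold procedure can be illustrated with the following sketch, which only generates the fold indices and averages a score across folds. The `fit_and_score` hook is hypothetical: in practice it would fit an LR, RF or SVM model on the training indices and return, for example, the AUC on the held-out fold. The seed and the unstratified shuffling are assumptions.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(n_samples, k, fit_and_score):
    """Average the score returned by fit_and_score(train, test) over k folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

# Placeholder scorer: returns the held-out fraction instead of a real AUC.
mean_score = cross_validate(10, 5, lambda train, test: len(test) / 10)
```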
(5) Diagnostic model validation and evaluation
The diagnostic model is constructed using a logistic regression algorithm and the data for the optimal potential marker combination. The optimal cut-off value for the diagnostic decision is defined using the Youden index (Youden's index). The cut-off value corresponding to the maximum Youden index is the point at which the biomarkers discriminate best, where the sum of sensitivity and specificity is greatest. The data set of the optimal potential marker combination is randomly divided into a training set and a test set in a fixed proportion, and ROC analysis is performed on each with the logistic regression model.
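The Youden index is J = sensitivity + specificity - 1, and the optimal cut-off is the one that maximizes J. A minimal sketch of that search is given below; the scores and labels are hypothetical, and the convention that a sample is positive when its score is strictly greater than the cut-off is an assumption chosen to match the decision rule stated later in the text.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    labels: 1 for the disease group, 0 for controls.
    A sample is called positive when its score is greater than the cutoff.
    On ties in J the lowest such cutoff is kept.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 0)
        sens, spec = tp / positives, tn / negatives
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best

# Hypothetical predicted probabilities and true group labels.
scores = [0.10, 0.30, 0.40, 0.55, 0.60, 0.70, 0.85, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j, sens, spec = youden_cutoff(scores, labels)
```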
The results of the above experiments are as follows:
(1) Feature weight calculation
The comprehensive weight of each marker is calculated by combining the scores it obtains under each feature selection method, and the markers are ranked from largest to smallest comprehensive weight; the larger the weight, the more the marker contributes to distinguishing experimental group samples from control samples. Referring to Fig. 1, across the different feature selection methods, A0A5C2FX14 (iglc680_light_igk1-39_igkj2), A0A1S5UZ16 (Target of mesh-SH 3), P01833 (Polymeric immunoglobulin receptor) and γGT are the four features with the greatest contribution (most frequently selected).
(2) Selection of candidate markers
To select candidate markers effectively, ROC analysis is used to evaluate how strongly each protein affects the AUC of the model. The results are shown in Fig. 2: starting from a single-factor model of A0A5C2FX14, the AUC rises as further markers are introduced into the model, indicating that the classification performance of the model keeps improving.
(3) Verification of candidate markers
Logistic regression, support vector machine and random forest models are constructed for the combination of the four candidate markers, and ROC curve analysis is used to judge how well the candidate markers classify the different sample groups. The results are shown in the box plots and ROC curves of Figs. 3 and 4. The area under the curve of each of the three models based on the four candidate markers is close to 1, indicating high clinical diagnostic efficacy; specificity and sensitivity are also relatively high. The results show that the candidate marker combination classifies the samples well.
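For reference, the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name and the example scores are illustrative and are not taken from the application.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    labels: 1 for positives, 0 for negatives. Tied pairs count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model outputs: three of the four pairs are ordered correctly.
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 1 means every positive outranks every negative, and 0.5 corresponds to chance-level ranking.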
Fig. 5 further plots accuracy, sensitivity and specificity against the cut-off value for each of the three models. Each value of accuracy, sensitivity and specificity corresponds to one cut-off value, so the curves can be used to identify the optimal cut-off value and the accuracy, sensitivity and specificity associated with it.
(4) Characterization of candidate markers
The importance coefficient of each candidate marker is calculated using a classification model built with the random forest algorithm, and the contributions of the markers to the model are compared. A higher importance coefficient indicates that the marker contributes more to distinguishing the groups. The results are shown in Fig. 6; the contributions in the model are, in order, γGT, P01833, A0A5C2FX14 and A0A1S5UZ16.
(5) Protein expression level of candidate markers
The expression levels of the candidate markers are shown in Fig. 7. The selected candidate markers are expressed at highly significantly different levels between the disease control group and the biliary atresia group.
(6) Correlation between candidate markers
In general, the lower the correlation between the candidate markers in a diagnostic combination, the less the selected markers overlap and the better the combination. Pearson correlation coefficients were calculated for the expression levels of the candidate markers; as shown in Fig. 8, the correlation between the markers is low.
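The pairwise Pearson coefficients can be computed as in the sketch below. The marker labels and expression values are hypothetical and serve only to show the calculation; they are not data from the application.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(expression):
    """Map each pair of marker names to the Pearson coefficient of their levels."""
    names = list(expression)
    return {(a, b): pearson(expression[a], expression[b])
            for a in names for b in names}

# Hypothetical standardized expression levels of three markers in five samples.
expression = {
    "marker_1": [0.2, -1.1, 0.5, 1.3, -0.9],
    "marker_2": [1.0, 0.1, -0.7, 0.4, -0.8],
    "marker_3": [-0.3, 0.9, 1.2, -1.0, -0.8],
}
matrix = correlation_matrix(expression)
```

Coefficients near 0 between different markers suggest that each marker carries largely independent information.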
(7) Construction of diagnostic combination model and capability assessment
The candidate marker diagnosis combination model is constructed by using a logistic regression algorithm, and the model results are shown in table 3:
TABLE 3 logistic regression model coefficients for markers
The logistic regression model formula for the final biomarker combination is as follows:
wherein Z = -0.52 - 1.25 × A0A5C2FX14 - 1.5 × A0A1S5UZ16 - 1.98 × P01833 - 2.24 × γGT;
each marker name in the formula for Z denotes the expression level of the corresponding marker protein after Z-score standardization.
During diagnosis, a probability value p is calculated according to the formula, and if the p value exceeds the cutoff value, the diagnosis is positive.
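The calculation can be sketched as follows, using the intercept and coefficients stated for Z and the cut-off value of 0.66 reported for this example. One point is an assumption: the formula for p is not reproduced in this text, so the standard logistic function p = 1 / (1 + e^(-Z)) is assumed here. If the original formula defines p differently, only the `probability` function would change. The function names and input values are hypothetical.

```python
import math

# Intercept and coefficients as stated in the text for Z.
INTERCEPT = -0.52
COEFFICIENTS = {
    "A0A5C2FX14": -1.25,
    "A0A1S5UZ16": -1.5,
    "P01833": -1.98,
    "γGT": -2.24,
}
CUTOFF = 0.66  # cut-off value reported for this example

def linear_predictor(z_scored):
    """Z = intercept + sum of coefficient * Z-score-standardized expression."""
    return INTERCEPT + sum(COEFFICIENTS[m] * z_scored[m] for m in COEFFICIENTS)

def probability(z_scored):
    """ASSUMPTION: standard logistic form p = 1 / (1 + e^(-Z))."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(z_scored)))

def is_positive(z_scored, cutoff=CUTOFF):
    """The diagnosis is positive when p exceeds the cut-off value."""
    return probability(z_scored) > cutoff

# Hypothetical standardized inputs: every marker at the cohort mean (0),
# and every marker one standard deviation below the mean (-1).
at_mean = {m: 0.0 for m in COEFFICIENTS}
below_mean = {m: -1.0 for m in COEFFICIENTS}
p_at_mean = probability(at_mean)
p_below_mean = probability(below_mean)
```

Keeping the linear predictor and the link function separate makes it easy to swap in the exact formula once it is known.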
The optimal cut-off value for the diagnostic decision is defined using the Youden index (Youden's index), which gives a good balance of sensitivity and specificity. It will be appreciated that the optimal cut-off value is not unique; different cut-off values may be selected according to different requirements for sensitivity and specificity.
High sensitivity is typically preferred when diagnosing a serious disease for which effective treatment exists, to avoid missed diagnoses; when a condition may be caused by several diseases, to rule out a particular disease; and when screening for a disease, as in population surveys or periodic health examinations. High specificity is typically preferred when confirming the diagnosis in patients with a high probability of the disease; for serious diseases whose treatment and prognosis are poor, to avoid misdiagnosis; and when the definitive treatment is itself highly damaging, so that a confirmed diagnosis is needed to avoid unnecessary harm to the patient.
The optimal cut-off value calculated using the Youden index and the corresponding indicators are shown in Table 4 below; the cut-off value finally defined in this example is 0.66. Fig. 9 shows how the accuracy, specificity and sensitivity of the model each vary with the cut-off value. Each point on a curve corresponds to one cut-off value, so the accuracy, sensitivity and specificity at the optimal cut-off value can be read off.
TABLE 4 optimal cut-off values and related diagnostic parameter values
(8) Combined diagnostic capability assessment
The original data set is randomly divided into a training set and a test set in a 70%/30% ratio, and the constructed diagnostic model is applied to the expression levels of the corresponding markers; the ROC analysis results are shown in Fig. 10. The AUC is 0.965 on the training set and 0.944 on the test set, showing that the logistic regression model of the marker combination classifies the test set samples well. With a cut-off value of 0.66 for the binary classification model, the accuracy on the test set is 89.47%, the sensitivity 90% and the specificity 88.89%.
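The split and the three reported metrics can be illustrated with the sketch below. The confusion-matrix counts are not given in the source; the counts used here (9 true positives, 1 false negative, 8 true negatives, 1 false positive) are one hypothetical set chosen only because it reproduces the reported percentages. The function names and the random seed are also assumptions.

```python
import random

def split_70_30(n_samples, train_fraction=0.7, seed=0):
    """Randomly split sample indices into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]

def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

train_idx, test_idx = split_70_30(10)

# Hypothetical counts consistent with the reported test-set percentages.
metrics = diagnostic_metrics(tp=9, fn=1, tn=8, fp=1)
percentages = {name: round(value * 100, 2) for name, value in metrics.items()}
```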
Comparative example: ROC comparison of different diagnostic combination models
Following the method in Example 1, a single protein or a subset of the proteins in the diagnostic combination is selected as the diagnostic marker, the original data set is randomly divided into a training set and a test set, and a logistic regression model is rebuilt. The results are shown in Fig. 11. The AUC values of γGT, A0A5C2FX14+γGT, A0A1S5UZ16+γGT, P01833+γGT, A0A5C2FX14+A0A1S5UZ16+γGT, A0A5C2FX14+P01833+γGT, A0A1S5UZ16+P01833+γGT and the full marker combination are 0.809, 0.913, 0.87, 0.922, 0.94, 0.934, 0.959, respectively. Thus, the marker combination provided in Example 1 achieves a more effective classification in the early diagnosis of BA.
Example 3
The present embodiment provides an electronic device including a processor and a memory, the memory having stored thereon a computer program executable on the processor, the processor implementing the following operations when the computer program is executed:
S1: obtaining the expression levels of A0A5C2FX14, A0A1S5UZ16, P01833 and γGT from a subject, and performing Z-score normalization;
S2: calculating the p value according to the following formula:
wherein Z = -0.52 - 1.25 × A0A5C2FX14 - 1.5 × A0A1S5UZ16 - 1.98 × P01833 - 2.24 × γGT;
each marker name in the formula for Z denotes the expression level of the corresponding marker protein after the processing in S1;
S3: comparing the calculated p value with the cut-off value of 0.66; if the p value is greater than the cut-off value, the subject's risk of biliary atresia is high; if the p value is not greater than the cut-off value, the subject's risk of biliary atresia is low.
The present application has been described in detail with reference to the embodiments, but the present application is not limited to the embodiments described above, and various changes can be made within the knowledge of one of ordinary skill in the art without departing from the spirit of the present application. Furthermore, embodiments of the present application and features of the embodiments may be combined with each other without conflict.

Claims (10)

1. Use of a reagent for detecting a marker combination in the preparation of a biliary atresia diagnostic product, the marker combination comprising A0A5C2FX14, A0A1S5UZ16, P01833 and γGT.
2. The use according to claim 1, wherein the reagent is used to detect the marker combination at the protein level.
3. The use according to claim 1, wherein the reagent is used to detect the marker combination in any one of blood, serum and plasma.
4. The use according to claim 1, wherein the biliary atresia diagnostic product is for diagnosing children of 0-4 months of age.
5. The use according to claim 4, wherein the biliary atresia diagnostic product is for diagnosing children within 60 days of age.
6. The use according to any one of claims 1 to 5, wherein the biliary atresia diagnostic product is for distinguishing intrahepatic cholestasis from biliary atresia.
7. A biliary atresia diagnostic product comprising reagents for detecting a marker combination comprising A0A5C2FX14, A0A1S5UZ16, P01833 and γGT.
8. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the following operations:
step 1: obtaining information on expression levels of a marker combination in a sample from a subject, the marker combination comprising A0A5C2FX14, A0A1S5UZ16, P01833 and γGT;
step 2: mathematically correlating the expression levels to obtain a score, the score being used to indicate the subject's risk of biliary atresia.
9. The computer-readable storage medium of claim 8, wherein the score is calculated according to a formula in Z in which e is the natural base; Z is calculated according to the formula Z = a_0 + Σ_i a_i × b_i, where a_i is the set weight of the i-th marker among A0A5C2FX14, A0A1S5UZ16, P01833 and γGT, b_i is the expression level of the i-th marker among A0A5C2FX14, A0A1S5UZ16, P01833 and γGT, and a_0 is the set intercept.
10. An electronic device comprising a processor and a memory, the memory having stored thereon a computer program executable on the processor, the processor performing the following operations when executing the computer program:
step 1: obtaining information on expression levels of a marker combination in a sample from a subject, the marker combination comprising A0A5C2FX14, A0A1S5UZ16, P01833 and γGT;
step 2: mathematically correlating the expression levels to obtain a score, the score being used to indicate the subject's risk of biliary atresia.
CN202311304291.9A 2023-10-09 2023-10-09 Application of reagent for detecting marker combination in preparation of biliary tract locking diagnosis product Pending CN117288962A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311304291.9A CN117288962A (en) 2023-10-09 2023-10-09 Application of reagent for detecting marker combination in preparation of biliary tract locking diagnosis product

Publications (1)

Publication Number Publication Date
CN117288962A true CN117288962A (en) 2023-12-26

Family

ID=89258392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311304291.9A Pending CN117288962A (en) 2023-10-09 2023-10-09 Application of reagent for detecting marker combination in preparation of biliary tract locking diagnosis product

Country Status (1)

Country Link
CN (1) CN117288962A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5601986A (en) * 1994-07-14 1997-02-11 Amgen Inc. Assays and devices for the detection of extrahepatic biliary atresia
CN112748249A (en) * 2020-12-18 2021-05-04 深圳市绘云生物科技有限公司 Application of neonatal biliary tract occlusion diagnostic marker

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MING FU, ET AL: "Proteomics Defines Plasma Biomarkers for the Early Diagnosis of Biliary Atresia", JOURNAL OF PROTEOME RESEARCH, 3 April 2024 (2024-04-03) *
王自能;宋元宗;郝虎;郭祖文;: "先天性胆道闭锁患儿心肌组织的电镜观察", 电子显微学报, no. 03, 15 June 2007 (2007-06-15), pages 221 - 224 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination