Disclosure of Invention
(I) Technical problem to be solved
In view of the shortcomings of the prior art, the invention provides biomarkers for predicting whether liver cancer recurs. The content of the biomarkers is measured by mass spectrometry to judge the risk of liver cancer recurrence. The method is simple and practical, and detecting several marker molecules can further improve the sensitivity and specificity of the detection method.
(II) Technical solution
In order to achieve the purpose, the invention is realized by the following technical scheme:
A biomarker for predicting liver cancer recurrence comprises one or more of the following protein molecules: LAS1L protein, CLTB protein, JAGN1 protein, ALYREF protein and HNRNPA3 protein.
A prognostic method using the biomarker for predicting liver cancer recurrence comprises the following step: detecting the protein expression intensity value of LAS1L protein, CLTB protein, JAGN1 protein, ALYREF protein or HNRNPA3 protein in the sample to be tested by LC-MS/MS mass spectrometry.
Preferably, when the protein expression intensity value of the LAS1L protein in the sample is more than 52085687.5, the patient is judged to be a relapse patient, otherwise the patient is judged to be a non-relapse patient after operation, and the false positive rate is 22.9%.
Preferably, when the protein expression intensity value of the CLTB protein in the sample is more than 941197795, the patient is judged to be a relapse patient, otherwise the patient is judged to be a non-relapse patient after operation, and the false positive rate is 22.9%.
Preferably, when the protein expression intensity value of JAGN1 protein in the sample is more than 467852297, the patient is judged to be a relapse patient, otherwise, the patient is judged to be a non-relapse patient after operation, and the false positive rate is 8.6%.
Preferably, when the protein expression intensity value of the ALYREF protein in the sample is less than 149218152, the patient is judged to be a relapse patient, otherwise, the patient is judged to be a non-relapse patient after operation, and the false positive rate is 22.9%.
Preferably, when the protein expression intensity value of HNRNPA3 protein in the sample is less than 4386125605, the patient is judged to be a relapse patient, otherwise the patient is judged to be a non-relapse patient after operation, and the false positive rate is 20%.
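The five single-marker decision rules above can be summarized as a small lookup table. The following is a minimal sketch only: the cut-off values and directions are taken from the text, while the names `MarkerRule`, `RULES` and `classify` are illustrative and not part of the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerRule:
    cutoff: float            # cut-off intensity value stated in the text
    relapse_if_above: bool   # True: relapse when value > cutoff; False: relapse when value < cutoff


# Cut-offs and directions as stated in the preferred embodiments above.
RULES = {
    "LAS1L": MarkerRule(52085687.5, True),
    "CLTB": MarkerRule(941197795, True),
    "JAGN1": MarkerRule(467852297, True),
    "ALYREF": MarkerRule(149218152, False),
    "HNRNPA3": MarkerRule(4386125605, False),
}


def classify(marker: str, intensity: float) -> str:
    """Return 'relapse' or 'non-relapse' for one marker intensity value."""
    rule = RULES[marker]
    if rule.relapse_if_above:
        relapse = intensity > rule.cutoff
    else:
        relapse = intensity < rule.cutoff
    return "relapse" if relapse else "non-relapse"
```

For example, `classify("ALYREF", 1.0e8)` falls below the ALYREF cut-off and is therefore reported as a relapse, whereas a value exactly equal to a cut-off is reported as a non-relapse under the strict inequalities used in the text.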
A kit for diagnosing whether liver cancer recurs comprises one or more of a reagent for specifically detecting LAS1L protein, a reagent for specifically detecting CLTB protein, a reagent for specifically detecting JAGN1 protein, a reagent for specifically detecting ALYREF protein and a reagent for specifically detecting HNRNPA3 protein.
Preferably, the reagent for specifically detecting the LAS1L protein is a primer or probe that specifically recognizes the LAS1L protein nucleic acid; the reagent for specifically detecting the CLTB protein is a primer or a probe which specifically recognizes the CLTB protein nucleic acid; the reagent for specifically detecting the JAGN1 protein is a primer or a probe for specifically recognizing JAGN1 protein nucleic acid; the reagent for specifically detecting the ALYREF protein is a primer or a probe for specifically recognizing the ALYREF protein nucleic acid; the reagent for specifically detecting the HNRNPA3 protein is a primer or a probe for specifically recognizing HNRNPA3 protein nucleic acid; the reagents may be used to detect a tissue sample.
(III) Advantageous effects
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention uses LC-MS/MS mass spectrometry to analyze the samples to be tested. After mass spectrometry of a large number of clinical samples, 5 protein molecules were found to have good detection value, based on the fold difference (greater than 2 or less than 0.5) in their content between cancer tissue from patients who did not relapse after liver cancer surgery and cancer tissue from patients who relapsed after liver cancer surgery. These 5 protein molecules (namely LAS1L protein, CLTB protein, JAGN1 protein, ALYREF protein and HNRNPA3 protein) can be used as biomarkers for diagnosing liver cancer recurrence.
(2) The invention uses LAS1L protein, CLTB protein, JAGN1 protein, ALYREF protein and HNRNPA3 protein as biomarkers to diagnose whether liver cancer has recurred in a subject. The method is simple and easy to operate, the diagnostic process is safe and effective and readily accepted by patients, the diagnostic standard is uniform, and the influence of subjective factors is small.
(3) Through the biomarkers detected by mass spectrometry, the method can provide new therapeutic targets and ideas for the future research and development of anti-liver cancer drugs.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Screening of biomarkers associated with liver cancer diagnosis
1. Experimental procedure
(1) Protein sample information
Samples: 5 cancer tissue samples from patients with recurrent liver cancer and 35 liver cancer tissue samples from patients without recurrence after surgery were obtained.
(2) Sample pretreatment
Protein was extracted from each sample by SDT lysis (4% (w/v) sodium dodecyl sulfate, 100 mM Tris/HCl pH 7.6, 0.1 M dithiothreitol) and then quantified by the BCA method. An appropriate amount of protein from each sample was digested with trypsin using the filter-aided sample preparation (FASP) method, and the peptides were desalted on a C18 cartridge, freeze-dried, redissolved in 40 μL of 0.1% formic acid solution, and quantified (OD280).
The BCA method is used for protein quantification; the protein concentration is calculated from the absorbance. Under alkaline conditions, protein reduces Cu²⁺ to Cu⁺, and Cu⁺ forms a purple complex with the BCA reagent, with two molecules of BCA chelating one Cu⁺. The absorbance of this water-soluble complex at 562 nm is compared with a standard curve to calculate the concentration of the protein to be measured.
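The standard-curve step of the BCA method can be illustrated as follows. This is a minimal sketch that assumes a straight-line standard curve fitted by least squares; real BCA curves can deviate from linearity at high concentrations. The function names and the standard values in the usage note are made up for illustration and are not measurements from this work.

```python
def fit_line(concentrations, absorbances):
    """Least-squares fit of absorbance = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, absorbances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_a562(a562, slope, intercept):
    """Invert the standard curve to get the concentration of an unknown sample."""
    return (a562 - intercept) / slope
```

With hypothetical standards of 0, 250, 500, 1000 and 2000 μg/mL giving A562 readings of 0.10, 0.225, 0.35, 0.60 and 1.10, `fit_line` recovers the line used to generate them, and an unknown sample reading 0.475 then corresponds to about 750 μg/mL.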
(3) LC-MS/MS data acquisition
Each sample was separated using a nanoliter flow rate HPLC liquid phase system Easy nLC.
Wherein: the buffer solution A was 0.1% formic acid aqueous solution, and the solution B was 0.1% formic acid acetonitrile aqueous solution (acetonitrile: 84%).
The column was equilibrated with 95% solution A. The sample was loaded by an autosampler onto a loading column (Thermo Scientific Acclaim PepMap100, 100 μm × 2 cm, nanoViper C18) and separated on an analytical column (Thermo Scientific EASY column, 10 cm, ID 75 μm, 3 μm, C18-A2) at a flow rate of 300 nL/min.
After chromatographic separation, the sample was analyzed on a Q-Exactive mass spectrometer. Detection was in positive ion mode, with a precursor ion scan range of 300-1800 m/z, a first-order mass spectrum resolution of 70,000 at 200 m/z, an AGC (automatic gain control) target of 1e6, a Maximum IT of 50 ms, and a dynamic exclusion time of 60.0 s. The mass-to-charge ratios of the peptides and peptide fragments were collected as follows: 20 fragment spectra (MS2 scans) were acquired after each full scan; the MS2 activation type was HCD, the isolation window was 2 m/z, the second-order mass spectrum resolution was 17,500 at 200 m/z, the Normalized Collision Energy was 30 eV, and the Underfill was 0.1%.
(4) Protein identification and quantitative analysis
The raw mass spectrometry data are RAW files; MaxQuant software (version 1.5.3.17) was used for database searching and quantitative analysis.
iBAQ Intensity is the protein expression amount in sample X calculated by the iBAQ algorithm, and is approximately equal to the absolute concentration of the protein in that sample. LFQ Intensity is the relative protein expression amount in sample X calculated by the LFQ algorithm, and is often used for comparisons between groups. In label-free workflows, one of the two is generally selected as the quantitative result.
iBAQ (intensity-based absolute quantification) and LFQ are two different protein quantification algorithms provided by MaxQuant software.
iBAQ is generally used for absolute quantification of proteins in samples, the main algorithm being based on the ratio of the sum of the intensities of the peptides identified for the protein to the theoretical number of peptides.
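The ratio described above can be written out directly. This is a minimal sketch of the stated principle only (summed peptide intensity divided by the theoretical number of peptides); it is not MaxQuant's implementation, and the function name and example numbers are illustrative.

```python
def ibaq(peptide_intensities, n_theoretical_peptides):
    """Sum of identified peptide intensities divided by the theoretical peptide count."""
    if n_theoretical_peptides <= 0:
        raise ValueError("theoretical peptide count must be positive")
    return sum(peptide_intensities) / n_theoretical_peptides
```

For instance, a protein with three identified peptides of intensity 2e6, 3e6 and 5e6 and five theoretical peptides would get an iBAQ value of 1e7 / 5 = 2e6.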
LFQ is generally used for pairwise quantitative comparisons between groups; the main algorithm applies pairwise correction at both the peptide and protein levels. This patent uses LFQ for protein quantification.
(5) Statistical analysis
Ratio calculation and statistical analysis are carried out on data having at least two non-null values within the same group across the three replicates, including the LFQ or iBAQ intensity ratios and P-values of all comparison groups; the differential proteins between groups are thereby preliminarily screened out.
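The valid-value filter described above can be sketched as follows. Two points are assumptions of this sketch rather than statements of the text: missing values are represented as `None` or `0`, and a protein is kept only if every compared group passes the filter. The function names are illustrative.

```python
def valid_in_group(replicates, min_valid=2):
    """True if a group of replicate intensities has at least min_valid non-null values."""
    n_valid = sum(1 for v in replicates if v is not None and v != 0)
    return n_valid >= min_valid


def keep_protein(groups, min_valid=2):
    """Assumption: require the filter to pass in every compared group."""
    return all(valid_in_group(values, min_valid) for values in groups.values())
```

A protein quantified in two of three replicates in each group is kept, while one quantified in only a single replicate of either group is excluded from the ratio calculation.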
Whether a differential protein is significant is then verified by its P-value. Proteins with Fold change > 2 or < 0.5 in the multidimensional statistical analysis are considered to differ markedly in content between the cancer tissue and the tissue adjacent to the cancer, and those with a univariate statistical analysis P value < 0.05 are selected as significantly different proteins, giving the differential protein molecules. SPSS software is then used to plot the ROC curve of each differential protein and to calculate the area under the curve (AUC), from which the diagnostic value of the protein is judged. Specifically, the AUC must be greater than 0.7 with P < 0.05, and the cut-off value at which the Youden index is maximal is used as the threshold for judging whether the tumor is present (if the fold change is greater than 2, a value above the threshold is a positive tumor detection; if the fold change is less than 0.5, a value below the threshold is a positive liver cancer detection), thereby obtaining higher sensitivity and specificity.
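The ROC analysis described above was performed in SPSS; the following Python sketch only illustrates the underlying calculations. It assumes that higher scores indicate the positive class (for a down-regulated marker the scores can be negated first) and that candidate cut-offs are midpoints between adjacent observed values. The function names and the toy data in the usage note are illustrative.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return (J, cutoff, sensitivity, specificity) maximizing J = sens + spec - 1."""
    values = sorted(set(scores))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in candidates:
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s > c)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s <= c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:  # keep the first maximum on ties
            best = (j, c, sens, spec)
    return best
```

With the toy scores 1 to 8 and labels `[0, 0, 0, 1, 0, 1, 1, 1]`, the AUC is 15/16 and the first cut-off reaching the maximal Youden index (0.75) is 3.5, with sensitivity 1.0 and specificity 0.75.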
(6) Bioinformatics analysis
① GO functional annotation
GO annotation of the target protein set with Blast2GO can be roughly summarized in four steps: sequence alignment (Blast), GO entry extraction (Mapping), GO annotation (Annotation) and InterProScan supplementary annotation (Annotation).
② KEGG pathway annotation
The target protein set was annotated with KEGG pathways using KAAS (KEGG Automatic Annotation Server) software.
③ Enrichment analysis of GO annotations and KEGG annotations
Fisher's Exact Test was used to compare the distribution of each GO category or KEGG pathway in the target protein set and in the total protein set, and GO annotation or KEGG pathway annotation enrichment analysis was performed on the target protein set.
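The enrichment comparison can be illustrated with a one-sided Fisher's exact test computed from the hypergeometric distribution. The text does not state whether a one-sided or two-sided test was used, so the one-sided (over-representation) form here is an assumption, and the function name and example counts are illustrative.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value for over-representation.

    k: target proteins carrying the annotation
    n: size of the target protein set
    K: proteins carrying the annotation in the total protein set
    N: size of the total protein set
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

For example, if 3 of 4 target proteins carry an annotation that is found on 5 of 20 proteins overall, the p-value is 155/4845, about 0.032.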
④ Protein clustering analysis
First, the quantitative information of the target protein set was normalized (to the (-1, 1) interval). Then the ComplexHeatmap R package (R version 3.4) was used to cluster the samples and the protein expression amounts in two dimensions simultaneously (distance algorithm: Euclidean; linkage: average linkage), and a hierarchical clustering heat map was generated.
⑤ Analysis of protein interaction network
The interaction relationship between the target proteins was found based on the information in the STRING database, and the interaction network was generated and analyzed using the Cytoscape software (version number: 3.2.1).
(7) Differentially expressed protein screening
For each comparison group, differentially expressed proteins were screened and counted using the criteria of a fold change greater than 2.0 (up-regulation greater than 2-fold or down-regulation less than 0.5-fold) and a P value less than 0.05.
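The screening criteria above can be expressed as a simple filter. This is a minimal sketch: the P values are taken as given because the text does not name the statistical test, and the function name, field names and protein labels in the usage note are hypothetical.

```python
def screen_differential(proteins, fc_up=2.0, fc_down=0.5, alpha=0.05):
    """Keep proteins with (fold change > fc_up or < fc_down) and P value < alpha.

    Each entry is a dict with keys: name, mean_relapse, mean_non_relapse, p_value.
    """
    selected = []
    for p in proteins:
        fold_change = p["mean_relapse"] / p["mean_non_relapse"]
        if p["p_value"] < alpha and (fold_change > fc_up or fold_change < fc_down):
            direction = "up" if fold_change > fc_up else "down"
            selected.append((p["name"], fold_change, direction))
    return selected
```

A protein with a fold change of 3 and P = 0.01 is kept as up-regulated, one with a fold change of 0.25 and P = 0.03 is kept as down-regulated, and one with a fold change of 3 but P = 0.20 is discarded.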
(8) Basic principle of experiment
Label-free quantitative proteomics has become an important mass spectrometry method in recent years. The Label-free technology has two main quantitative principles. The first, spectral counting, was developed earlier and has given rise to several quantitative algorithms; its core principle is to use the MS2 identification results as the basis of quantification, and the methods differ in how later algorithms correct the high-throughput data. The second is based on MS1 and calculates the integral of each peptide signal in the LC-MS chromatogram. The MaxQuant algorithm adopted by the invention is based on the second principle.
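The second principle, integrating a peptide signal over its chromatographic peak, can be illustrated with the trapezoidal rule. This is a simplified sketch of the idea only; MaxQuant's actual feature detection and integration are more elaborate, and the function name and data points are illustrative.

```python
def peak_area(times, intensities):
    """Trapezoidal integral of an MS1 intensity trace over retention time."""
    area = 0.0
    for k in range(1, len(times)):
        width = times[k] - times[k - 1]
        area += width * (intensities[k] + intensities[k - 1]) / 2
    return area
```

A triangular peak sampled at times 0, 1, 2, 3, 4 with intensities 0, 10, 20, 10, 0 has an area of 40 in the corresponding units.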
2. Results of the experiment
By mass spectrum data analysis and comparison of protein molecules of liver cancer tissues of patients with relapse and liver cancer tissues of patients without relapse (no relapse after operation), 5 protein molecules are obtained and can be used as biomarkers of liver cancer relapse.
To evaluate how well the protein expression intensity value of each protein molecule diagnoses liver cancer recurrence, the invention uses ROC curve analysis. The AUC, the area under the ROC curve, is the most commonly used parameter for characterizing a ROC curve and is also an important index of test accuracy. An AUC below 0.7 indicates low diagnostic accuracy; an AUC above 0.7 can meet the requirement of clinical diagnosis.
Specific results and analyses were as follows:
(1) LC-MS/MS mass spectrometry detected a difference in LAS1L protein between liver cancer recurrent tissue and postoperative non-recurrent tissue.
The study found that LAS1L protein was significantly up-regulated by 11.68-fold in the liver cancer recurrence samples, with a p value of less than 0.05.
As can be seen from FIG. 1, the AUC of LAS1L protein is 0.806>0.7, which indicates that LAS1L protein has a good effect of determining whether liver cancer has recurred or not, and can be used as a biomarker for determining whether liver cancer has recurred or not.
When the cut off value of LAS1L protein was 52085687.5, the sensitivity was 80% and the specificity was 77.1%. When the individual detection was performed, the protein expression level intensity value of LAS1L protein was more than 52085687.5, and the patient was judged to be a relapsed patient, otherwise, the patient was judged to be a non-relapsed patient (false positive rate: 22.9%).
As can be seen from FIG. 2, the liver cancer recurrent tissue samples were mainly distributed above the detection threshold (solid line in FIG. 2), and the postoperative non-recurrent tissue samples were mainly distributed below the detection threshold, indicating that the difference in the protein expression level between the liver cancer recurrent tissue and the postoperative non-recurrent tissue was large, and the detection threshold had a good detection effect.
In conclusion, the LAS1L protein can be used as a biomarker for liver cancer recurrence.
(2) LC-MS/MS mass spectrometry detected a difference in CLTB protein between liver cancer recurrent tissue and postoperative non-recurrent tissue.
The study found that CLTB protein was significantly up-regulated by 6.68-fold in the liver cancer recurrence samples, with a p value of less than 0.05.
As can be seen from FIG. 3, the AUC of CLTB protein is 0.806>0.7, which indicates that CLTB protein has a good judgment effect and can be used as a biomarker for determining whether liver cancer has recurred.
When the cut-off value of the CLTB protein was 941197795, the sensitivity was 80% and the specificity was 77.1%. In individual detection, a CLTB protein expression intensity value of more than 941197795 is judged as a relapse patient; otherwise the patient is judged as a non-relapse patient (false positive rate 22.9%).
As can be seen from fig. 4, the liver cancer recurrent tissue samples are mainly distributed above the detection threshold (solid line in fig. 4), and the postoperative non-recurrent tissue samples are mainly distributed below the detection threshold, which indicates that the difference between the protein expression level intensity values of the liver cancer recurrent tissue and the postoperative non-recurrent tissue is large, and the detection threshold has a good detection effect.
In summary, the CLTB protein can be used as a biomarker for recurrence of liver cancer.
(3) LC-MS/MS mass spectrometry detected a difference in JAGN1 protein between liver cancer recurrent tissue and postoperative non-recurrent tissue.
The study found that JAGN1 protein was significantly up-regulated by 2.45-fold in the liver cancer recurrence samples, with a p value of less than 0.05.
As shown in FIG. 5, the AUC of JAGN1 protein was 0.840>0.7, indicating that JAGN1 protein has a good effect of determining the recurrence of liver cancer.
When the cut-off value of the JAGN1 protein was 467852297, the sensitivity was 80% and the specificity was 91.4%. In individual detection, a JAGN1 protein expression intensity value of more than 467852297 is judged as a relapse patient; otherwise the patient is judged as a non-relapse patient (false positive rate 8.6%).
As can be seen from fig. 6, the liver cancer recurrent tissue samples were mainly distributed above the detection threshold (solid line in fig. 6), and the postoperative non-recurrent tissue samples were mainly distributed below the detection threshold, indicating that the difference in the protein expression level intensity values between the liver cancer recurrent tissue and the postoperative non-recurrent tissue was large, and the detection threshold had a good detection effect.
In conclusion, the JAGN1 protein can be used as a biomarker for liver cancer recurrence.
(4) LC-MS/MS mass spectrometry detected a difference in ALYREF protein between recurrent tissue and non-recurrent tissue.
The study found that ALYREF protein was significantly down-regulated to 0.19-fold in the liver cancer recurrence samples, with a p value of less than 0.05.
As can be seen from fig. 7, AUC of the ALYREF protein is 0.937>0.7, which indicates that the ALYREF protein has a good determination effect and can be used as a biomarker for determining whether liver cancer has relapsed.
When cut off value of the ALYREF protein is 149218152, the sensitivity is 100%, and the specificity is 77.1%. When the individual detection is carried out, the protein expression intensity value of the ALYREF protein is less than 149218152, the patient is judged to be a relapse patient, otherwise, the patient is judged to be a non-relapse patient after the operation (the false positive rate is 22.9%).
As can be seen from fig. 8, the liver cancer recurrent tissue samples were mainly distributed below the detection threshold (solid line in fig. 8), and the non-recurrent tissue samples were mainly distributed above the detection threshold, indicating that the intensity values of the protein expression levels of the liver cancer recurrent tissue and the non-recurrent tissue are greatly different, and the detection threshold has a good detection effect.
In summary, the ALYREF protein can be used as a biomarker for liver cancer recurrence.
(5) LC-MS/MS mass spectrometry detected a difference in HNRNPA3 protein between recurrent tissue and non-recurrent tissue.
The study found that HNRNPA3 protein was significantly down-regulated to 0.49-fold in the liver cancer recurrence samples, with a p value of less than 0.05.
As can be seen from FIG. 9, the AUC of HNRNPA3 protein is 0.920>0.7, which indicates that HNRNPA3 protein has a good effect of determining and can be used as a biomarker for determining whether liver cancer has recurred.
When the cut-off value of the HNRNPA3 protein was 4386125605, the sensitivity was 100% and the specificity was 80%. In individual detection, an HNRNPA3 protein expression intensity value of less than 4386125605 is judged as a relapse patient; otherwise the patient is judged as a non-relapse patient after operation (false positive rate 20%).
As can be seen from fig. 10, the liver cancer recurrent tissue samples were mainly distributed below the detection threshold (solid line in fig. 10), and the non-recurrent tissue samples were mainly distributed above the detection threshold, indicating that the intensity values of the protein expression levels of the liver cancer recurrent tissue and the non-recurrent tissue are greatly different, and the detection threshold has a good detection effect.
In conclusion, the HNRNPA3 protein can be used as a biomarker for liver cancer recurrence.
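The sensitivity, specificity and false positive rates reported for the five single markers can be cross-checked against the group sizes given in the sample information (5 relapse and 35 non-relapse samples). The sketch below is a consistency check only: the per-marker counts it prints are inferred from the reported percentages and are not stated in the text, and the names used are illustrative.

```python
N_RELAPSE, N_NON_RELAPSE = 5, 35  # group sizes from the sample information

# marker: (sensitivity %, specificity %, false positive rate %) as reported above
REPORTED = {
    "LAS1L": (80, 77.1, 22.9),
    "CLTB": (80, 77.1, 22.9),
    "JAGN1": (80, 91.4, 8.6),
    "ALYREF": (100, 77.1, 22.9),
    "HNRNPA3": (100, 80, 20),
}


def implied_counts(sens_pct, spec_pct):
    """Infer (TP, FN, TN, FP) from percentages and the fixed group sizes."""
    tp = round(sens_pct / 100 * N_RELAPSE)
    tn = round(spec_pct / 100 * N_NON_RELAPSE)
    return tp, N_RELAPSE - tp, tn, N_NON_RELAPSE - tn


for marker, (sens, spec, fpr) in REPORTED.items():
    tp, fn, tn, fp = implied_counts(sens, spec)
    # false positive rate = FP / (FP + TN) = 1 - specificity
    print(marker, "TP", tp, "FN", fn, "TN", tn, "FP", fp,
          "FPR %", round(100 * fp / N_NON_RELAPSE, 1))
```

In each case the false positive rate equals 100% minus the specificity; for example, a specificity of 77.1% corresponds to 27 of 35 non-relapse samples classified correctly and 8 misclassified (8/35, about 22.9%).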
(6) The application of the combination of the ALYREF protein and the HNRNPA3 protein in preparing the early liver cancer diagnostic kit.
The invention also provides a method for diagnosing liver cancer, comprising the following step: the value of P (probability of recurrence after liver cancer surgery) is calculated by binary logistic regression analysis, and the formula obtained from binary logistic regression in SPSS software is as follows:
where a is the protein expression intensity value of ALYREF protein and b is the protein expression intensity value of HNRNPA3 protein; if the calculated P (probability of recurrence after liver cancer surgery) is greater than 0.0694114, the patient is judged to be a relapse patient, otherwise the patient is judged to be a non-relapse patient.
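The fitted formula itself is not reproduced in this text. A binary logistic regression on two predictors generally takes the form P = 1 / (1 + exp(-(β0 + β1·a + β2·b))), and the sketch below assumes that general form. The coefficients `beta0`, `beta1` and `beta2` are placeholders that would have to be taken from the fitted SPSS model; no values are supplied here, and the function names are illustrative. Only the cut-off on P comes from the text.

```python
import math

P_CUTOFF = 0.0694114  # cut-off on P stated in the text


def recurrence_probability(a, b, beta0, beta1, beta2):
    """Generic two-predictor logistic model; coefficients are caller-supplied placeholders.

    a: ALYREF protein expression intensity value
    b: HNRNPA3 protein expression intensity value
    """
    z = beta0 + beta1 * a + beta2 * b
    # Numerically stable evaluation of 1 / (1 + exp(-z))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def is_relapse(p):
    """Apply the decision rule from the text: relapse if P exceeds the cut-off."""
    return p > P_CUTOFF
```

Because the intensity values are on the order of 10^8 to 10^9, the two-branch evaluation avoids floating-point overflow when the linear term is large in magnitude.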
As shown in fig. 11, AUC of the combined protein is 0.977>0.7, which indicates that the combined protein has a better judgment effect, i.e. the combined protein can be used as a biomarker for determining whether liver cancer has relapsed.
When the cut-off value of P for the combined protein was 0.0694114, the sensitivity was 100% and the specificity was 88.6%. In individual detection, a P value of the combined protein greater than 0.0694114 is judged as a relapse patient; otherwise the patient is judged as a non-relapse patient after operation (false positive rate 11.4%).
As can be seen from fig. 12, the liver cancer recurrent tissue samples were mainly distributed above the detection threshold (solid line in fig. 12), and the non-recurrent tissue samples were mainly distributed below the detection threshold, indicating that the values for the liver cancer recurrent tissue and the non-recurrent tissue differ greatly and that the detection threshold has a good detection effect.
In view of the above results, the combination protein consisting of the ALYREF protein and HNRNPA3 protein can be used as a biomarker for recurrence of liver cancer.