CN111444941A - Method for diagnosing early lung cancer by combining electrolyte in serum and proteomic data - Google Patents

Info

Publication number
CN111444941A
Authority
CN
China
Prior art keywords
lung cancer
sample
serum
training set
normal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010137369.2A
Other languages
Chinese (zh)
Inventor
李艳坤
董汝南
马昕鹏
庞佳烽
景璟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Electric Power University
Original Assignee
North China Electric Power University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China Electric Power University
Priority to CN202010137369.2A
Publication of CN111444941A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Landscapes

  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Urology & Nephrology (AREA)
  • Hematology (AREA)
  • Biomedical Technology (AREA)
  • Immunology (AREA)
  • Food Science & Technology (AREA)
  • Biophysics (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biotechnology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a method for diagnosing early lung cancer by combining serum electrolyte data with proteomic data. First, the contents of five conventional electrolytes and eight proteins in the serum of known lung cancer samples and normal (non-lung-cancer) samples are processed by wavelet transform and used as a training set. The training set is then modeled with a linear discriminant analysis algorithm to obtain the eigenvector E, which holds one value per variable. A sample to be diagnosed is processed by the same wavelet transform to form a prediction set, and the prediction set is multiplied by E to obtain a discriminant vector value, by which lung cancer samples are distinguished from normal samples.

Description

Method for diagnosing early lung cancer by combining electrolyte in serum and proteomic data
Technical Field
The present invention relates to a method for diagnosing early lung cancer by combining electrolytes in serum and proteomic data.
Background
Early diagnosis and timely treatment are the most effective ways to improve the survival rate of cancer patients. However, early diagnosis of cancer has long troubled medical staff and researchers. Lung cancer is one of the most common and most lethal malignant tumors in China, and its incidence and rate of increase rank first among all malignant tumors. Because its early symptoms are not obvious and its latency period is long, the early diagnosis and identification of lung cancer remain a difficult problem in medicine.
Tumor markers are substances that undergo qualitative or quantitative changes in the body fluids or tissues of a tumor patient and may reflect the presence or characteristics of a tumor. Traditional tumor markers include alpha-fetoprotein (AFP), carcinoembryonic antigen (CEA), neuron-specific enolase (NSE), and the like. Detection of tumor markers is currently an important means of tumor diagnosis and treatment in clinical practice in China and abroad, but diagnosis based on a single traditional protein marker still has many shortcomings; in particular, its sensitivity and specificity need to be improved. Diagnosis that combines several tumor markers, and the search for new tumor markers with clinical significance, are therefore of great importance.
Human serum contains abundant information and can provide important clues about disease. Compared with tissue or cell testing, serum testing is convenient and fast, and it does less harm to the human body. The electrolyte content of serum can indicate various diseases. Electrolytes are involved in many important functions and metabolic activities in the body and play an important role in maintaining normal physiological activity, and many diseases cause or are accompanied by disorders of serum electrolytes. A great deal of research has addressed the traditional protein tumor markers, whereas the relationship between electrolytes and tumors has yet to be studied systematically.
Disclosure of Invention
The invention aims to provide a method for diagnosing early lung cancer by combining electrolyte and proteomic data in serum, offering a new route to rapid, efficient and accurate diagnosis of early lung cancer. The combined data give more accurate results than identification based on the electrolyte data set or the proteomic data set alone.
The method takes a group of patients with confirmed lung cancer and a normal (control) group as the training set. The content data of five conventional electrolytes and eight proteins in their serum are processed by discrete wavelet transform, and the training set is then modeled by linear discriminant analysis (LDA) to obtain the eigenvector (E), which holds one value per variable. E has dimension n × 1, where n is the number of variables.
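For readers who want the algebra behind E, one common reading is the two-class Fisher criterion. The patent does not state which LDA formulation is used, so the scatter matrices S_W and S_B, the class means μ, and the sample count m below are assumptions introduced for illustration; only E, n, X_new and DV come from the text.

```latex
\[
E \;=\; \arg\max_{w \in \mathbb{R}^{n \times 1}}
        \frac{w^{\top} S_B\, w}{w^{\top} S_W\, w},
\qquad
S_B = (\mu_{\text{cancer}} - \mu_{\text{normal}})
      (\mu_{\text{cancer}} - \mu_{\text{normal}})^{\top},
\]
\[
E \;\propto\; S_W^{-1}\,(\mu_{\text{cancer}} - \mu_{\text{normal}}),
\qquad
\mathrm{DV} \;=\; X_{\text{new}}\, E,
\quad
X_{\text{new}} \in \mathbb{R}^{m \times n},\;
\mathrm{DV} \in \mathbb{R}^{m \times 1}.
\]
```

Under this reading each sample to be diagnosed is reduced to a single DV value, and the two groups separate along that one axis.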
The method comprises the following specific steps:
1. The content data of five conventional electrolytes (K⁺, Na⁺, Cl⁻, Ca²⁺, CO₂ combining power) and eight proteins (NSE, Cyfra21-1, CA153, CEA, CA199, CA125, AFP, HCG) in the serum of a group of patients with confirmed lung cancer and a normal group are processed by discrete wavelet transform ("db10" wavelet basis, decomposition scale 3) and used as the training set;
2. Linear discriminant analysis is performed on the training set data obtained in step 1 to obtain the eigenvector (E), which holds one value per variable;
3. The data of an unknown sample to be diagnosed (the prediction set, X_new), containing the same five conventional electrolytes and eight proteins, are processed by the same discrete wavelet transform as the training set, and the discriminant vector (DV) corresponding to the transformed low-dimensional matrix is calculated as DV = X_new E. The cancer group and the normal group are then identified according to the differences in DV values and the clustering result.
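The three steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a Haar approximation stands in for the "db10" basis (whose filter coefficients are not reproduced here), the two-class Fisher direction stands in for the unspecified LDA formulation, a small ridge term is added for numerical stability, and the samples are synthetic random numbers. All function names are hypothetical.

```python
import math
import random

def haar_approx(row, levels=3):
    """Approximation coefficients of a Haar DWT (stand-in for the db10 basis)."""
    coeffs = list(row)
    for _ in range(levels):
        if len(coeffs) < 2:
            break
        if len(coeffs) % 2:                 # pad odd lengths by repeating the last value
            coeffs.append(coeffs[-1])
        coeffs = [(coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
                  for i in range(0, len(coeffs), 2)]
    return coeffs

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mean_vec(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fisher_lda(cancer, normal, ridge=1e-6):
    """Two-class Fisher direction E = S_W^-1 (mu_cancer - mu_normal).
    The ridge term on the diagonal is an assumption, not part of the source."""
    mu_c, mu_n = mean_vec(cancer), mean_vec(normal)
    d = len(mu_c)
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for rows, mu in ((cancer, mu_c), (normal, mu_n)):
        for x in rows:
            dev = [xi - mi for xi, mi in zip(x, mu)]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += dev[i] * dev[j]
    return solve(sw, [c - n for c, n in zip(mu_c, mu_n)])

def discriminant_values(x_new, e):
    """DV = X_new E, one value per sample."""
    return [sum(xi * ei for xi, ei in zip(x, e)) for x in x_new]

# Synthetic stand-in data: 13 variables per sample (5 electrolytes + 8 proteins).
random.seed(0)

def fake_samples(n, shift):
    return [[random.gauss(shift, 1.0) for _ in range(13)] for _ in range(n)]

cancer = [haar_approx(r) for r in fake_samples(30, 1.0)]   # step 1
normal = [haar_approx(r) for r in fake_samples(30, 0.0)]
E = fisher_lda(cancer, normal)                             # step 2
dv_cancer = discriminant_values(cancer, E)                 # step 3
dv_normal = discriminant_values(normal, E)
print("E =", E)
print("mean DV (cancer):", sum(dv_cancer) / len(dv_cancer))
print("mean DV (normal):", sum(dv_normal) / len(dv_normal))
```

With this choice of E, the mean DV of the cancer group is always larger than that of the normal group on the training data, which is what makes the DV values usable for separating the two groups.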
Drawings
FIG. 1 shows the identification and diagnosis results for the lung cancer and normal samples of the example, using the combined electrolyte and protein data.
FIG. 2 shows the identification and diagnosis results for the lung cancer and normal samples of the example, using the electrolyte data alone.
FIG. 3 shows the identification and diagnosis results for the lung cancer and normal samples of the example, using the protein data alone.
Detailed Description
Example:
(1) The 252nd Hospital of the Chinese People's Liberation Army provided serum from 94 lung cancer patients and 58 normal (non-lung-cancer) subjects, and the contents of five conventional electrolytes (K⁺, Na⁺, Cl⁻, Ca²⁺, CO₂ combining power) and eight proteins (NSE, Cyfra21-1, CA153, CEA, CA199, CA125, AFP, HCG) in the serum were measured.
(2) The serum index data of the lung cancer samples and the normal samples were randomly divided into a training set, a test set and a prediction set, and processed by discrete wavelet transform ("db10" wavelet basis, decomposition scale 3).
(3) Linear discriminant analysis was performed on the training set samples to calculate the eigenvector (E).
(4) The DV values of the test set and the prediction set were calculated from E, and the cancer group and the normal group were discriminated according to the differences in DV values and the clustering result; the result is shown in FIG. 1. FIG. 2 and FIG. 3 show the results obtained with the electrolyte data alone and the protein data alone, respectively. Compared with these, the combined data of FIG. 1 clearly give the best diagnostic result, with an accuracy of 100%.
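The random split in step (2) and the accuracy check in step (4) might look like the sketch below. The split fractions, the midpoint threshold rule and all function names are assumptions: the example gives neither the set sizes nor the exact decision rule, only that the groups are separated by their DV values and the clustering result. The DV values used here are toy numbers, not data from the patent.

```python
import random

def split_three_ways(samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle and cut samples into training, test and prediction sets.
    The 60/20/20 fractions are hypothetical; the source does not state them."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_test = int(len(shuffled) * fractions[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])

def midpoint_threshold(dv_values, labels):
    """Midpoint between the mean DV of the cancer (1) and normal (0) samples."""
    cancer = [v for v, y in zip(dv_values, labels) if y == 1]
    normal = [v for v, y in zip(dv_values, labels) if y == 0]
    return (sum(cancer) / len(cancer) + sum(normal) / len(normal)) / 2

def accuracy(dv_values, labels, threshold):
    """Fraction of samples whose thresholded DV matches the known label.
    Assumes cancer samples have the larger DV values."""
    predicted = [1 if v > threshold else 0 for v in dv_values]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)

# Labels only, matching the cohort sizes of the example (94 cancer, 58 normal).
labels = [1] * 94 + [0] * 58
train_set, test_set, prediction_set = split_three_ways(labels)
print(len(train_set), len(test_set), len(prediction_set))

# Toy DV values, invented for illustration only.
toy_dv = [2.1, 1.8, -0.3, 0.1]
toy_labels = [1, 1, 0, 0]
threshold = midpoint_threshold(toy_dv, toy_labels)
toy_accuracy = accuracy(toy_dv, toy_labels, threshold)
print("threshold:", threshold, "accuracy:", toy_accuracy)
```

Fixing the seed keeps the split reproducible, so the same samples land in the same set on every run.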

Claims (1)

1. A method for diagnosing early lung cancer by combining electrolyte and proteomic data in serum, comprising the following steps:
step one, combining the content data of five conventional electrolytes (K⁺, Na⁺, Cl⁻, Ca²⁺, CO₂ combining power) and eight proteins (NSE, Cyfra21-1, CA153, CEA, CA199, CA125, AFP, HCG) contained in known lung cancer serum samples and normal (non-lung-cancer) serum samples, processing the combined data by discrete wavelet transform ("db10" wavelet basis, decomposition scale 3), and using the result as a training set;
step two, performing linear discriminant analysis on the training set to calculate the eigenvector E, which holds one value per variable;
step three, processing the sample to be diagnosed by the same wavelet transform method as the training set to serve as a prediction set, multiplying the prediction set by E to obtain the discriminant vector value corresponding to the sample, and identifying lung cancer samples and normal samples according to the difference in magnitude of the discriminant vector values.
CN202010137369.2A 2020-02-24 2020-02-24 Method for diagnosing early lung cancer by combining electrolyte in serum and proteomic data Pending CN111444941A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010137369.2A CN111444941A (en) 2020-02-24 2020-02-24 Method for diagnosing early lung cancer by combining electrolyte in serum and proteomic data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010137369.2A CN111444941A (en) 2020-02-24 2020-02-24 Method for diagnosing early lung cancer by combining electrolyte in serum and proteomic data

Publications (1)

Publication Number Publication Date
CN111444941A (en) 2020-07-24

Family

ID=71657429

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010137369.2A Pending CN111444941A (en) 2020-02-24 2020-02-24 Method for diagnosing early lung cancer by combining electrolyte in serum and proteomic data

Country Status (1)

Country Link
CN (1) CN111444941A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140242608A1 (en) * 2013-02-28 2014-08-28 Seoul National University R&Db Foundation Composition for diagnosis of lung cancer and diagnosis kit for lung cancer
JP2018036264A (en) * 2016-09-02 2018-03-08 バイオインフラ生命科学株式会社BioInfra Life Science Inc. Composite biomarkers for diagnosing lung cancer in subject, lung cancer diagnostic kit using the same, method of using information on composite biomarkers, and computing system for implementing the same
CN108802389A (en) * 2018-06-04 2018-11-13 郭伟 A kind of kit for Early stage NSCLC diagnosis
CN109884302A (en) * 2019-03-14 2019-06-14 北京博远精准医疗科技有限公司 Lung cancer early diagnosis marker and its application based on metabolism group and artificial intelligence technology
CN110751983A (en) * 2019-11-14 2020-02-04 华北电力大学(保定) Method for screening characteristic mRNA (messenger ribonucleic acid) for diagnosing early lung cancer

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
王和勇: "《面向大数据的高维数据挖掘技术》", 31 May 2018 *
董汝南等: "基因表达数据解析及异位点探寻", 《2019中国化学会第十五届全国计算(机)化学学术会议》 *
魏霞等: "肿瘤标志物联合血清钙离子检测诊断早期肺癌骨转移37例", 《武警医学》 *

Similar Documents

Publication Publication Date Title
JP4950993B2 (en) System and method for comparing and editing metabolite data from multiple samples using a computer system database
EP3803415A1 (en) Protein biomarkers for identifying and treating aging skin and skin conditions
CN112763474B (en) Biomarker for predicting or detecting acute leukemia
CN101403740A (en) Mass spectrum model used for detecting liver cancer characteristic protein and preparation method thereof
CN111710372A (en) Exhaled air detection device and method for establishing exhaled air marker thereof
CN112183616B (en) Diagnostic marker and kit for diagnosis of glioma, screening method and construction method of glioma diagnostic model
WO2022194033A1 (en) Peripheral blood tcr marker for diffuse large b-cell lymphoma, and detection kit and use therefor
CN111965240A (en) Product, application and method for thyroid cancer related screening and assessment
CN110501443B (en) Novel biomarker for noninvasive identification/early warning of fatty liver cows
Wölfler et al. Mass spectrometry and serum pattern profiling for analyzing the individual risk for endometriosis: promising insights?
CN111444941A (en) Method for diagnosing early lung cancer by combining electrolyte in serum and proteomic data
CN111596054A (en) Tumor marker
CN116148482A (en) Device for breast cancer patient identification and its preparation and use
CN113970638B (en) Molecular marker for determining extremely early occurrence risk of gastric cancer and evaluating progression risk of gastric precancerous lesion and application of molecular marker in diagnostic kit
CN116030032A (en) Breast cancer analysis equipment, system and storage medium based on Raman spectrum data
CN114578060A (en) Method for using SAMHD1 protein as II-stage colorectal cancer curative effect prediction marker
CN110993092A (en) Method for identifying liver cirrhosis and liver cancer based on N-glucose fingerprint and big data algorithm
CN108823308A (en) Detect application and the kit of circMAN1A2 and LOC284454 reagent
CN108624692A (en) Gene marker and application thereof for the good pernicious examination of small pulmonary nodules
CN115792247B (en) Application of protein combination in preparation of thyroid papillary carcinoma risk auxiliary layering system
CN107991490A (en) The method of diagnosis of alzheimer's disease
CN118230955A (en) Model training method for depression detection, detection system, program, and storage medium
Zhu et al. LBA03-04 MRI-DERIVED TUMOR VOLUME AS A PREDICTOR OF BIOCHEMICAL RECURRENCE AND ADVERSE PATHOLOGY IN PATIENTS AFTER RADICAL PROSTATECTOMY: A PROPENSITY SCORE MATCHING STUDY
Matysiak et al. Proteomic and metabolomic strategy of searching for biomarkers of genital cancer diseases using mass spectrometry methods
Leemans et al. Screening of Breast Cancer from Sweat Samples Analyzed by 2-Dimensional Gas Chromatography-Mass Spectrometry: A Preliminary Study. Cancers 2023, 15, 2939

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200724