CN114743593B - Construction method of prostate cancer early screening model based on urine, screening model and kit - Google Patents


Info

Publication number
CN114743593B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210660400.XA
Other languages
Chinese (zh)
Other versions
CN114743593A (en)
Inventor
曹善柏
丁峰
刘磊
孙宏
周涛
王旺
楼峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Xiangxin Biotechnology Co ltd
Tianjin Xiangxin Medical Instrument Co ltd
Beijing Xiangxin Biotechnology Co ltd
Original Assignee
Tianjin Xiangxin Biotechnology Co ltd
Tianjin Xiangxin Medical Instrument Co ltd
Beijing Xiangxin Biotechnology Co ltd
Application filed by Tianjin Xiangxin Biotechnology Co ltd, Tianjin Xiangxin Medical Instrument Co ltd, Beijing Xiangxin Biotechnology Co ltd
Priority to CN202210660400.XA
Publication of CN114743593A
Application granted
Publication of CN114743593B

Classifications

    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • C12Q1/6858 Allele-specific amplification
    • C12Q1/6869 Methods for sequencing
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • C12Q2600/154 Methylation markers
    • C12Q2600/156 Polymorphic or mutational markers

Abstract

The application relates to the technical field of biotechnology, and in particular discloses a method for constructing a urine-based early screening model for prostate cancer, the screening model, and a kit. The construction method comprises the following steps: extracting and sequencing urine from healthy samples and cancer samples to obtain sequencing results; analyzing the sequencing results to obtain the prostate cancer-related mutation markers of the healthy samples and the cancer samples, the markers comprising CNV events, motif types, fragment distribution characteristics and methylation characteristics; and standardizing the marker data of the prostate cancer-related mutation markers to construct the early screening model for prostate cancer. The screening model provided by the application enables high-precision early screening for prostate cancer by a non-invasive method.

Description

Construction method of prostate cancer early screening model based on urine, screening model and kit
Technical Field
The application relates to the technical field of biotechnology, in particular to a construction method, a screening model and a kit for an early prostate cancer screening model based on urine.
Background
Prostate cancer (PCa) is a malignant tumor with a high incidence in older men. The incidence is low before age 55, rises gradually after age 55, and peaks at 70-80 years of age. Patients with familial hereditary prostate cancer, however, develop the disease earlier, with 43% of such patients aged 55 years or younger. Prostate cancer is often asymptomatic in its early stages, so it is difficult for patients to notice the condition.
Currently, the clinical diagnosis of prostate cancer relies mainly on digital rectal examination, serum PSA, transrectal prostate ultrasound and the like. Serum PSA has low sensitivity and specificity, and digital rectal examination depends heavily on the physician's experience. In addition, digital rectal examination and transrectal prostate ultrasound are highly invasive and unpleasant for patients.
Disclosure of Invention
The application provides a method for constructing a urine-based early screening model for prostate cancer, the screening model, and a kit. The model achieves high-precision early screening for prostate cancer by a non-invasive method, with high sensitivity and specificity.
In a first aspect, the present application provides a method for constructing a urine-based early screening model for prostate cancer, which adopts the following technical scheme:
A method for constructing a urine-based early screening model for prostate cancer, comprising the following steps:
extracting and sequencing urine from healthy samples and cancer samples to obtain sequencing results;
analyzing the sequencing results to obtain the prostate cancer-related mutation markers of the healthy samples and the cancer samples, the prostate cancer-related mutation markers comprising CNV events, motif types, fragment distribution characteristics and methylation characteristics;
and standardizing the marker data of the prostate cancer-related mutation markers to construct the early screening model for prostate cancer.
The sequencing results of urine samples from prostate cancer patients are analyzed in four dimensions (CNV events, motif types, fragment distribution characteristics and methylation characteristics) to construct a model capable of early screening for prostate cancer. The model obtained by this construction method achieves high-precision early screening for prostate cancer by a non-invasive method, with high sensitivity and specificity.
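The text does not specify how the marker data are standardized. As one possible reading, the minimal sketch below applies z-score standardization to each marker column before model training. The function name `standardize_columns` and the toy values are hypothetical and are not taken from the application.

```python
from statistics import mean, pstdev

def standardize_columns(rows):
    """Z-score each feature column: (x - column mean) / column std.

    Assumption: z-scoring is used here only as an illustration; the
    application does not name the standardization method.
    """
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        mu = mean(col)
        sigma = pstdev(col)
        if sigma == 0:
            # A constant column carries no information; map it to zeros.
            scaled_columns.append([0.0] * len(col))
        else:
            scaled_columns.append([(x - mu) / sigma for x in col])
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*scaled_columns)]

# Hypothetical marker matrix: one row per urine sample, one column per marker.
features = [
    [1.0, 10.0],
    [2.0, 10.0],
    [3.0, 10.0],
]
scaled = standardize_columns(features)
```

Standardizing per column keeps markers measured on very different scales (for example, segment lengths in base pairs and motif proportions) comparable when they are fed to a classifier.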
Preferably, the CNV event analysis method specifically includes the following steps:
obtaining a BAM file from the urine sequencing results and constructing a baseline after standardization; processing the information in the baseline and calculating the expected coverage and the median segment variance of each bin; then segmenting with the CBS method to obtain the average log2 ratio of each segment; judging from the average log2 ratio whether each segment carries a CNV event; and, according to the judgment results, accumulating the total length of the segments carrying CNV events on each autosome in the healthy samples and the cancer samples;
learning from the results of the CNV event analysis with an SVM and performing statistical analysis; using a significance threshold of p < 0.0001, the chromosome arms on which the total length of CNV segments in the tumor samples is significantly higher than that in the healthy samples are taken as markers, and these are finally determined as the markers of the CNV event.
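The accumulation step described above can be sketched as follows. This is a minimal illustration that assumes the CBS segments are already available as tuples of (chromosome arm, start, end, average log2 ratio). The cutoff `LOG2_RATIO_CUTOFF` is hypothetical, since the text does not state the value used to call a CNV event, and the function name is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical cutoff: the application does not state the log2 ratio
# threshold used to decide whether a segment carries a CNV event.
LOG2_RATIO_CUTOFF = 0.2

def cnv_length_per_arm(segments, cutoff=LOG2_RATIO_CUTOFF):
    """Sum the length of segments whose |average log2 ratio| exceeds the cutoff."""
    totals = defaultdict(int)
    for arm, start, end, mean_log2_ratio in segments:
        if abs(mean_log2_ratio) > cutoff:
            totals[arm] += end - start
    return dict(totals)

# Toy segments: (chromosome arm, start, end, average log2 ratio).
segments = [
    ("8q", 0, 5_000_000, 0.45),           # gain, counted
    ("8q", 5_000_000, 9_000_000, 0.05),   # near neutral, ignored
    ("13q", 1_000_000, 3_000_000, -0.30), # loss, counted
]
totals = cnv_length_per_arm(segments)
```

The per-arm totals from healthy and tumor samples would then be compared statistically (the text uses p < 0.0001) to choose the marker arms.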
Preferably, the markers of the CNV event include 2p, 3p, 5q, 6q, 7p, 8q and 13q.
Preferably, the method for analyzing the motif type specifically comprises the following steps:
obtaining a BAM file from the urine sequencing results, counting all 4-bp end-sequence motif types and their occurrence frequencies, and then calculating the proportion of each motif type among all motif types;
learning from the results of the motif type analysis with an SVM (support vector machine) and performing statistical analysis; the motif types that meet the threshold p < 0.0001 are selected as markers, and these are finally determined as the markers of the motif type.
Preferably, the markers of the motif type include 53 motif types.
Preferably, the markers of the motif type include ATAC, ATAT, ATAG, ATGA, ATGT, AGCC, AGTC, TACG, TTAT, TTCA, TGAG, TGGG, TGGC, TGAC, TGCG, TCAC, TCAG, TCTC, TCTA, TCCG, TCCC, GAAC, GAAT, GATC, GAGC, GACG, GACC, GTAT, GGCT, GGCG, GGCC, GCAG, GCTC, GCCC, CATC, CACG, CTAA, CTAT, CTAG, CTAC, CTTA, CTTT, CTTG, CTTC, CTGC, CTGT, CTCT, CGAG, CGGT, CGCG, CCAT.
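The motif counting step can be illustrated with a short sketch. It assumes that fragment sequences have already been read from the BAM file into plain strings and that the motif is taken from the first 4 bases of each fragment. Both points are assumptions, as are the function name `end_motif_proportions` and the toy sequences.

```python
from collections import Counter

def end_motif_proportions(fragment_sequences, k=4):
    """Return the proportion of each k-bp end motif among all counted motifs.

    Assumption: the motif is the first k bases of the fragment sequence;
    the application only says "4-bp end sequence".
    """
    counts = Counter()
    for seq in fragment_sequences:
        motif = seq[:k].upper()
        # Skip fragments that are too short or contain ambiguous bases.
        if len(motif) == k and set(motif) <= set("ACGT"):
            counts[motif] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {motif: n / total for motif, n in counts.items()}

# Toy fragment sequences (hypothetical).
fragments = ["ATACGGTTA", "ATACCCGTA", "TGGGACCTA", "NNACGT", "ACG"]
proportions = end_motif_proportions(fragments)
```

With 4-bp motifs there are at most 256 motif types, so each sample yields a fixed-length proportion vector from which the marker motifs can be selected.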
Preferably, the method for analyzing the distribution characteristics of the fragments specifically comprises the following steps:
obtaining a BAM file from the urine sequencing results and determining, by analysis of variance, the specific length threshold that separates long and short fragments, wherein a fragment longer than the threshold is a long fragment and a fragment shorter than the threshold is a short fragment; dividing the genome into intervals of 5 Mb, counting the numbers of long fragments and short fragments aligned to each interval, and calculating their ratio;
comparing the ratios obtained from the healthy urine samples with those obtained from the tumor urine samples, and taking the long-fragment and short-fragment distribution characteristics whose ratios differ as markers, which are finally determined as the markers of the fragment distribution characteristics.
Preferably, the markers of the fragment distribution characteristics are the long-fragment and short-fragment distribution characteristics on all autosomes.
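A minimal sketch of the per-interval ratio follows. The 5 Mb interval size comes from the text; the length threshold of 150 bp, the direction of the ratio (short divided by long) and the function name are assumptions made only for illustration, since the application derives the threshold by analysis of variance and does not state its value.

```python
from collections import defaultdict

BIN_SIZE = 5_000_000    # 5 Mb intervals, as stated in the text
LENGTH_THRESHOLD = 150  # hypothetical value; the text derives it by ANOVA

def short_long_ratio_per_bin(fragments, threshold=LENGTH_THRESHOLD, bin_size=BIN_SIZE):
    """Count short and long fragments per genomic interval and return short/long."""
    counts = defaultdict(lambda: [0, 0])  # (chrom, bin index) -> [short, long]
    for chrom, start, length in fragments:
        key = (chrom, start // bin_size)
        if length < threshold:
            counts[key][0] += 1
        elif length > threshold:
            counts[key][1] += 1
        # Fragments exactly at the threshold are left unclassified,
        # because the text defines only "larger" and "smaller".
    # The ratio is undefined for intervals without long fragments: report None.
    return {key: (s / l if l else None) for key, (s, l) in counts.items()}

# Toy fragments: (chromosome, alignment start, fragment length).
fragments = [
    ("chr1", 100, 90),
    ("chr1", 2_000_000, 120),
    ("chr1", 4_999_999, 200),
    ("chr1", 5_000_000, 80),
]
ratios = short_long_ratio_per_bin(fragments)
```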
Preferably, the method for analyzing the methylation characteristics specifically comprises the following steps:
performing differential analysis on TCGA tissue data in combination with the urine sequencing results to select differentially methylated genes and an internal reference gene; quantifying the differentially methylated genes of the test sample by methylation-specific polymerase chain reaction; calculating the methylation value of each differentially methylated gene relative to the internal reference gene; and calling each differential gene target negative or positive;
if any differential gene target is positive, the urine sample is a cancer sample; if all differential gene targets are negative, the urine sample is a healthy sample.
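The decision rule above (any positive target means a cancer sample) can be sketched as follows. The text does not give the formula for the methylation value, so this example assumes a relative value of 2^-(Ct_target - Ct_reference) and a hypothetical positivity cutoff. The function names, Ct values and cutoff are illustrative assumptions only.

```python
def relative_methylation(ct_target, ct_reference):
    """Relative methylation value, assumed here to be 2^-(Ct_target - Ct_reference)."""
    return 2.0 ** -(ct_target - ct_reference)

def classify_sample(ct_by_gene, ct_reference, cutoff):
    """Call each target positive or negative, then apply the any-positive rule."""
    calls = {
        gene: relative_methylation(ct, ct_reference) > cutoff
        for gene, ct in ct_by_gene.items()
    }
    label = "cancer" if any(calls.values()) else "healthy"
    return label, calls

# Hypothetical qMSP Ct values; the reference Ct stands in for the internal reference gene.
label, calls = classify_sample(
    {"GSTP1": 30.0, "APC": 36.0}, ct_reference=28.0, cutoff=0.1
)
```

Because a single positive target is enough to call a sample cancerous, the rule favors sensitivity, and the per-gene cutoffs control how many healthy samples are flagged.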
Preferably, the differential genes screened by the methylation characteristics comprise a HOXD3 gene, a RASSF1 gene, a GSTP1 gene, a CDH1 gene, a FOXP1 gene and an APC gene.
In a second aspect, the present application provides an early stage screening model for prostate cancer, which is constructed by using the above construction method.
The screening model provided by the application can accurately screen the prostate cancer sample in the early stage, the accuracy is as high as 94.33%, and the screening model has a good clinical application value.
In a third aspect, the present application provides a kit, which adopts the following technical scheme:
A kit for extracting and detecting urine in the above construction method.
The kit comprises a urine supernatant cfDNA extraction part, a genome sequencing part, a urine sediment cell genomic DNA extraction part and a gene characteristic region CpG methylation detection part.
The cfDNA extraction part specifically comprises the following reagents: Proteinase K at a concentration of 20 mg/mL, Sample Lysis Buffer, cfDNA Lysis/Binding Solution, Magnetic Nanoparticles, Apostle MiniMax cfDNA Wash Solution, Apostle MiniMax cfDNA 2nd Wash Solution working solution and Apostle MiniMax cfDNA Solution.
The Apostle MiniMax cfDNA 2nd Wash Solution working solution is prepared from Apostle MiniMax cfDNA 2nd Wash Solution and absolute ethanol at a volume ratio of 4:1.
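As a small worked example of the 4:1 volume ratio, the sketch below splits a desired total volume of working solution into its two components. The helper name `wash_working_solution` is hypothetical. For a 2000 µL total it gives 1600 µL of wash solution and 400 µL of absolute ethanol, which matches the volumes used in the preparation example.

```python
def wash_working_solution(total_ul, ratio=(4, 1)):
    """Split a total volume into wash solution and absolute ethanol at the given ratio."""
    wash_part, ethanol_part = ratio
    unit = total_ul / (wash_part + ethanol_part)
    return {
        "wash_solution_ul": wash_part * unit,
        "absolute_ethanol_ul": ethanol_part * unit,
    }

volumes = wash_working_solution(2000)
```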
The genome sequencing part specifically comprises the following reagents: Nuclease-Free Water, End Repair & A-Tailing Buffer, End Repair & A-Tailing Enzyme Mix, DNA Ligation linker, PCR-grade Water, Ligation Buffer, DNA Ligation Enzyme, KAPA HiFi HotStart ReadyMix, (P5 + P7) Primer Mix.
The urine sediment cell genomic DNA extraction part comprises the following reagents: Buffer GA, Proteinase K at a concentration of 10 mg/mL, Buffer GB, Buffer GD and Buffer PW.
The gene characteristic region CpG methylation detection part comprises the following reagents: ACTB_Primer_F, ACTB_Primer_R, ACTB_Probe, APC_Primer_F, APC_Primer_R, APC_Probe, GSTP1_Primer_F, GSTP1_Primer_R, GSTP1_Probe, Lightning Conversion Reagent, M-Binding Buffer, M-Wash Buffer, L-Desulphonation Buffer, M-Elution Buffer, Nuclease-Free Water, DNA with a 5% methylation rate, blood cell DNA, qMSP reaction system.
In summary, the present application has the following beneficial effects:
The method analyzes the urine sequencing results in four dimensions: CNV events, motif types, the distribution of long and short fragments, and methylation. The indexes from these dimensions are then combined through machine learning algorithms such as the support vector machine and the random forest method, a training set is constructed to train the model, and the early screening model for prostate cancer is obtained; its performance and accuracy are then tested. Finally, the early screening model for prostate cancer can be used to classify a single specimen or multiple specimens as cancer or non-cancer.
Drawings
FIG. 1 is a ROC plot for early stage screening for prostate cancer based on urine in example 2 of the present application.
Detailed Description
The application provides a method for constructing a urine-based early screening model for prostate cancer, comprising the following steps: extracting and sequencing urine from healthy samples and cancer samples to obtain sequencing results;
analyzing the sequencing results to obtain the prostate cancer-related mutation markers of the healthy samples and the cancer samples, the prostate cancer-related mutation markers comprising CNV events, motif types, fragment distribution characteristics and methylation characteristics;
and standardizing the marker data of the prostate cancer-related mutation markers to construct the early screening model for prostate cancer.
The CNV event analysis method specifically comprises the following steps:
obtaining a BAM file from the urine sequencing results and constructing a baseline after standardization; processing the information in the baseline and calculating the expected coverage and the median segment variance of each bin; then segmenting with the CBS method to obtain the average log2 ratio of each segment; judging from the average log2 ratio whether each segment carries a CNV event; and, according to the judgment results, accumulating the total length of the segments carrying CNV events on each autosome in the healthy samples and the cancer samples;
learning from the results of the CNV event analysis with an SVM and performing statistical analysis; using a significance threshold of p < 0.0001, the chromosome arms on which the total length of CNV segments in the tumor samples is significantly higher than that in the healthy samples are taken as markers, and these are finally determined as the markers of the CNV event.
The method for analyzing the motif type specifically comprises the following steps:
obtaining a BAM file from the urine sequencing results, counting all 4-bp end-sequence motif types and their occurrence frequencies, and then calculating the proportion of each motif type among all motif types;
learning from the results of the motif type analysis with an SVM (support vector machine) and performing statistical analysis; the motif types that meet the threshold p < 0.0001 are selected as markers, and these are finally determined as the markers of the motif type.
The method for analyzing the fragment distribution characteristics specifically comprises the following steps:
obtaining a BAM file from the urine sequencing results and determining, by analysis of variance, the specific length threshold that separates long and short fragments, wherein a fragment longer than the threshold is a long fragment and a fragment shorter than the threshold is a short fragment; dividing the genome into intervals of 5 Mb, counting the numbers of long fragments and short fragments aligned to each interval, and calculating their ratio;
comparing the ratios obtained from the healthy urine samples with those obtained from the tumor urine samples, and taking the long-fragment and short-fragment distribution characteristics whose ratios differ as markers, which are finally determined as the markers of the fragment distribution characteristics.
The method for analyzing the methylation characteristics specifically comprises the following steps:
performing differential analysis on TCGA tissue data in combination with the urine sequencing results to select differentially methylated genes and an internal reference gene; quantifying the differentially methylated genes of the test sample by methylation-specific polymerase chain reaction; calculating the methylation value of each differentially methylated gene relative to the internal reference gene; and calling each differential gene target negative or positive;
if any differential gene target is positive, the urine sample is a cancer sample; if all differential gene targets are negative, the urine sample is a healthy sample.
The application also provides an early screening model for prostate cancer constructed by the above construction method. The screening model can accurately identify prostate cancer samples at an early stage, with an accuracy as high as 94.33%, and has good clinical application value.
The application also provides a kit for extracting and detecting the urine supernatant cfDNA in the above construction method. The kit comprises a urine supernatant cfDNA extraction part, a genome sequencing part, a urine sediment cell genomic DNA extraction part and a gene characteristic region CpG methylation detection part.
The present application is described in further detail below with reference to preparation examples and examples.
Preparation example
This preparation example provides a kit. The kit is used for extracting the urine supernatant cfDNA (cell-free DNA) from a urine sample, performing low-depth sequencing on specific regions of the urine supernatant cfDNA, extracting the genomic DNA of urine sediment cells, and detecting CpG methylation in gene characteristic regions. The urine sample may be a cancer sample (in this embodiment, specifically a prostate cancer sample) or a healthy sample.
1. The kit specifically comprises a urine supernatant cfDNA extraction part, a genome sequencing part, a urine sediment cell genomic DNA extraction part and a gene characteristic region CpG methylation detection part.
The reagents of each part are as follows: (1) The urine supernatant cfDNA extraction part specifically comprises the following reagents:
Proteinase K at a concentration of 20 mg/mL, Sample Lysis Buffer, cfDNA Lysis/Binding Solution, Magnetic Nanoparticles, Apostle MiniMax cfDNA Wash Solution, Apostle MiniMax cfDNA 2nd Wash Solution working solution and Apostle MiniMax cfDNA Solution.
(2) The genome sequencing part is used for low-depth sequencing of specific regions of the urine supernatant cfDNA and comprises the following reagents:
Nuclease-Free Water, End Repair & A-Tailing Buffer, End Repair & A-Tailing Enzyme Mix, DNA Ligation linker, PCR-grade Water, Ligation Buffer, DNA Ligation Enzyme, KAPA HiFi HotStart ReadyMix, (P5 + P7) Primer Mix.
(3) The urine sediment cell genomic DNA extraction part comprises the following reagents:
Buffer GA, Proteinase K at a concentration of 10 mg/mL, Buffer GB, Buffer GD and Buffer PW.
(4) The gene characteristic region CpG methylation detection part comprises the following reagents:
ACTB_Primer_F, ACTB_Primer_R, ACTB_Probe, APC_Primer_F, APC_Primer_R, APC_Probe, GSTP1_Primer_F, GSTP1_Primer_R, GSTP1_Probe, Lightning Conversion Reagent, M-Binding Buffer, M-Wash Buffer, L-Desulphonation Buffer, M-Elution Buffer, Nuclease-Free Water, DNA with a 5% methylation rate, blood cell DNA, qMSP reaction system.
2. In the kit, the method for extracting cfDNA specifically comprises the following steps:
(1) Collection and transport of urine samples
Collecting urine: urine is collected using one collection package intended for urine supernatant to obtain the urine sample.
Transport conditions and time: the transport temperature is 2-8 °C; the transport time does not exceed 3 days.
(2) Extraction of urine supernatant cfDNA
1) Urine pretreatment: place the urine sample obtained in step (1) in a room-temperature centrifuge, centrifuge at 1600 rpm for 10 min, and transfer the resulting supernatant into a 5 mL centrifuge tube with a pipette; repeat the centrifugation to obtain the urine supernatant. In this example, cfDNA extraction is described for 4 mL of urine supernatant.
2) The specific operation flow of cfDNA extraction is as follows:
a. Add 320 µL of Proteinase K (20 mg/mL) to the urine supernatant obtained in step 1); mix by inversion, then add 400 µL of Sample Lysis Buffer; mix and incubate in a metal bath at 60 °C for 30 min;
b. Add 5 mL of cfDNA Lysis/Binding Solution and 60 µL of Magnetic Nanoparticles in sequence to a 15 mL centrifuge tube; then add the incubated system from step a to the same 15 mL centrifuge tube, place the tube on a four-dimensional rotating mixer, and incubate with inversion mixing for 10 min at room temperature;
c. Place the 15 mL centrifuge tube from step b in a room-temperature centrifuge, centrifuge at 1600 rpm for 2 min, and then place the tube on a magnetic rack to capture the magnetic beads; once the solution is clear, aspirate and discard the supernatant;
d. Add 1 mL of Apostle MiniMax cfDNA Wash Solution to the 15 mL centrifuge tube from step c, mix by shaking, and transfer the system into a 1.5 mL centrifuge tube placed on a magnetic rack in advance; once the solution in the 1.5 mL tube is colorless and clear, transfer the supernatant back to the 15 mL centrifuge tube to rinse the remaining magnetic beads, and transfer the bead suspension from this second rinse to the 1.5 mL centrifuge tube to capture the beads; once the solution in the 1.5 mL tube is clear, aspirate and discard the supernatant;
e. Add 1 mL of Apostle MiniMax cfDNA Wash Solution to the 1.5 mL centrifuge tube from step d, rinse by shaking for 30 s, centrifuge briefly, and place the tube on a magnetic rack to capture the magnetic beads; once the liquid in the tube is clear, aspirate and discard the supernatant;
f. Preparation of the Apostle MiniMax cfDNA 2nd Wash Solution working solution: take one new 5 mL centrifuge tube, add 1600 µL of Apostle MiniMax cfDNA 2nd Wash Solution and 400 µL of absolute ethanol, mix by inversion, and centrifuge briefly for later use;
g. Add 1 mL of the Apostle MiniMax cfDNA 2nd Wash Solution working solution to the 1.5 mL centrifuge tube from step e, rinse by shaking for 30 s, centrifuge briefly, place on a magnetic rack to capture the magnetic beads, and aspirate and discard the supernatant once the liquid in the tube is clear; repeat this operation, then remove the residual working solution with a 20 µL pipette tip;
h. Open the lid of the 1.5 mL centrifuge tube from step g, keep the tube on the magnetic rack, and air-dry at room temperature for 5 min until the surface of the magnetic beads is no longer glossy; add 50 µL of Apostle MiniMax cfDNA Solution to the beads and elute by shaking at room temperature for 5 min; centrifuge briefly to collect the beads at the bottom of the tube, then place the tube on a magnetic rack to capture the beads; once the liquid in the tube is clear, transfer the supernatant into a new 1.5 mL centrifuge tube to obtain the urine supernatant cfDNA.
3. In the kit, the method for genome sequencing specifically comprises the following steps (the sequencing sample is the urine supernatant cfDNA extracted as above):
(1) End repair and addition of "A" tails
Transfer 10 ng of the sequencing sample into a PCR tube and bring the volume to 50 µL with Nuclease-Free Water (ThermoFisher); add the end repair and A-tailing reagents to the PCR tube according to the following reaction system, mix by shaking, centrifuge briefly, and run the following PCR program in a PCR instrument.
Reaction system: End Repair & A-Tailing Buffer 7 µL/RXN; End Repair & A-Tailing Enzyme Mix 3 µL/RXN.
PCR reaction program: 20 °C for 30 min; 65 °C for 30 min; hold at 4 °C; heated lid at 70 °C.
(2) Adapter ligation
After the reaction in step (1) is complete, add the adapter ligation reaction components to the reaction system according to the following reaction system, mix by shaking, centrifuge briefly, and run the following PCR program in a PCR instrument.
Reaction system: end repair and A-tailing product 60 µL/RXN; DNA linker 2.5 µL/RXN; Nuclease-Free Water 7.5 µL/RXN; Ligation Buffer 30 µL/RXN; DNA Ligase Enzyme 10 µL/RXN.
PCR reaction program: 20 °C for 15 min, heated lid off.
(3) Purification after adapter ligation
Take the Beckman AMPure XP magnetic beads (BECKMAN COULTER) out of the 2-8 °C refrigerator, mix by shaking, and equilibrate at room temperature for 30 min;
transfer the reaction system from step (2) to a new 1.5 mL centrifuge tube, add 88 µL of AMPure XP magnetic beads, mix by shaking, centrifuge briefly, and incubate at room temperature for 10 min;
after incubation, place the 1.5 mL centrifuge tube on a magnetic rack to capture the magnetic beads; once the solution is clear, aspirate and discard the supernatant;
add 400 µL of 80% ethanol to the 1.5 mL centrifuge tube, rotate the tube one full turn, and aspirate and discard the supernatant; repeat this wash once, then centrifuge briefly, place the tube on the magnetic rack to capture the magnetic beads, and remove the residual ethanol with a 20 µL pipette tip. Keep the 1.5 mL centrifuge tube on the magnetic rack and air-dry at room temperature for 5 min until the surface of the magnetic beads is no longer glossy;
add 21 µL of Nuclease-Free Water to the 1.5 mL centrifuge tube, mix by shaking, and incubate at room temperature for 5 min;
centrifuge the 1.5 mL centrifuge tube briefly, place it on the magnetic rack to capture the magnetic beads, and once the solution in the tube is clear, transfer the supernatant into a prepared 0.2 mL PCR tube to obtain the purified adapter ligation product.
(4) Library amplification
Add each PCR component of the library amplification system to the PCR tube containing the purified adapter ligation product obtained in step (3) according to the following reaction system, mix by vortexing, centrifuge briefly, and run the following PCR program on a PCR instrument.
Reaction system: purified adapter ligation product 20 μL/RXN; KAPA HiFi HotStart ReadyMix 25 μL/RXN; (P5 + P7) Primer Mix 5 μL/RXN.
PCR program: 98 ℃ for 45 s; (98 ℃ for 15 s; 60 ℃ for 30 s; 72 ℃ for 30 s) × 6 cycles; 72 ℃ for 1 min; hold at 4 ℃.
(5) Purification of libraries after amplification
Take the Beckman AMPure XP magnetic beads out of the 2-8 ℃ refrigerator, mix by shaking, and equilibrate at room temperature for 30 min;
transfer the reaction system amplified in step (4) to a new 1.5 mL centrifuge tube, add 80 μL of AMPure XP magnetic beads, mix by shaking, centrifuge briefly, and incubate at room temperature for 10 min;
after incubation, place the 1.5 mL centrifuge tube on a magnetic rack to capture the beads; once the solution is clear, aspirate and discard the supernatant;
add 400 μL of 80% ethanol to the 1.5 mL centrifuge tube, rotate the tube one full turn, and aspirate and discard the supernatant; repeat this wash once, then centrifuge briefly and place the tube on the magnetic rack to capture the beads; remove the residual ethanol with a 20 μL pipette tip; keep the 1.5 mL centrifuge tube on the magnetic rack and air-dry at room temperature for 5 min until the surface of the beads is no longer glossy;
add 21 μL of Nuclease-Free Water to the 1.5 mL centrifuge tube, mix by shaking, and incubate at room temperature for 5 min;
centrifuge the 1.5 mL centrifuge tube briefly, place it on the magnetic rack to capture the beads, and, once the solution in the tube is clear, transfer the supernatant into a pre-prepared 1.5 mL centrifuge tube to obtain the purified library.
(6) Library quality testing and sequencing
For the purified library from step (5), quantification was performed using a Qubit 4 fluorometer with the Qubit dsDNA HS Assay Kit (Thermo Fisher). Library fragment quality control was performed using the Agilent 2100 Bioanalyzer with the Agilent 2100 DNA 1000 Kit. Before sequencing, the molar amount of the library was quantified on an ABI StepOnePlus, and 21 G of sequencing data was scheduled according to the quantification result. Sequencing was performed on a NovaSeq 6000 sequencer at a raw sequencing depth of 7×. Sequencing yields the sequencing result of the urine supernatant cfDNA.
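As a rough consistency check on the data volume and depth quoted above, the following sketch relates the two. It assumes that "21G" means 21 gigabases of raw data and uses an approximate human reference genome size of 3.1 Gb; both values are assumptions for illustration and are not stated in the text.

```python
def raw_depth(total_bases: float, genome_size: float) -> float:
    """Mean raw sequencing depth = total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Assumed values: 21 Gb of raw data, ~3.1 Gb human reference genome.
depth = raw_depth(21e9, 3.1e9)
print(f"approximate raw depth: {depth:.1f}x")
```

Under these assumptions the result is a little under 7×, which is in line with the raw depth given above.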
4. In the kit, the method for extracting the genomic DNA of the urine sediment cells specifically comprises the following steps:
(1) Collection and transport of urine samples
Urine collection: urine was collected in 2 sterile 50 mL centrifuge tubes, each pre-filled with 3.5 mL of Urine Conditioning Buffer (ZYMO RESEARCH), to obtain the urine sample.
Transport conditions and time: the transport temperature is 2-8 ℃; the transport time does not exceed 3 d.
(2) Urine sediment cell genome DNA extraction
1) Pretreating urine:
a. Place the 2 tubes of 50 mL urine sample obtained in step (1) in a room-temperature centrifuge and centrifuge at 1,600 × g for 10 min. After centrifugation, discard the supernatant and keep the pellet;
b. add 1 mL of PBS to each of the 2 tubes of urine sediment, resuspend the sediment by pipetting, transfer each suspension into one of 2 pre-prepared 1.5 mL centrifuge tubes, centrifuge at 12,000 × g for 1 min, and remove the supernatant;
c. repeat step b once, and finally resuspend the urine sediment in 1 mL of PBS for DNA extraction.
2) Extracting DNA of urine sediment cells: the specific operation flow is as follows:
a. Centrifuge the urine sediment suspension at 12,000 × g for 1 min at room temperature, remove the supernatant, and collect the urine sediment cells;
b. add 200 μL of Buffer GA to the urine sediment cells collected in step a and mix by pipetting, add 20 μL of Proteinase K (10 mg/mL), vortex for 2-3 s, then centrifuge briefly to collect the lysate at the bottom of the tube;
c. add 200 μL of Buffer GB to the tube from step b and incubate in a metal bath preset to 70 ℃ for 10 min, vortexing for 10 s every 3 min. After incubation, centrifuge briefly for 2-3 s to remove droplets from the inside of the tube cap;
d. add 200 μL of absolute ethanol to the system from step c, mix thoroughly by shaking for 15 s, and centrifuge briefly for 2-3 s to remove droplets from the inside of the tube cap;
e. let the system from step d stand at room temperature for 2-3 min, then transfer the lysate to the adsorption column CB3 in portions of 600 μL. Centrifuge at 12,000 rpm for 1 min and discard the flow-through;
f. add 500 μL of Buffer GD (confirm before use that absolute ethanol has been added) to the adsorption column, centrifuge at 12,000 rpm for 1 min, and discard the flow-through;
g. add 600 μL of Buffer PW (confirm before use that absolute ethanol has been added), centrifuge at 12,000 rpm for 1 min, and discard the flow-through;
h. repeat step g once;
i. transfer the CB3 adsorption column from step h into a new waste collection tube and centrifuge at 12,000 rpm for 2 min to remove residual ethanol;
j. transfer the centrifuged CB3 adsorption column into a pre-prepared 1.5 mL centrifuge tube, open the cap, and air-dry for 5 min;
k. add 53 μL of elution buffer TE dropwise onto the center of the adsorption membrane without touching it, and let stand at room temperature for 5 min. Then centrifuge at 12,000 rpm for 2 min and discard the adsorption column. The DNA solution in the 1.5 mL centrifuge tube is the extracted urine sediment genomic DNA, used for quantification and qMSP detection.
5. In the kit, the method for detecting CpG methylation in the gene characteristic regions is as follows; the test sample is the urine sediment genomic DNA extracted above.
The method specifically comprises the following steps:
(1) DNA bisulfite conversion
1) Add 20 μL of the DNA sample and 130 μL of Lightning Conversion Reagent to a PCR tube. Close the cap tightly, mix thoroughly by shaking, and centrifuge briefly;
2) Place the PCR tube in a PCR instrument and run the program: 98 ℃ for 8 min; 54 ℃ for 60 min; hold at 4 ℃;
3) Add 600 μL of M-Binding Buffer to a Zymo-Spin™ IC Column (hereinafter referred to as the spin column) and place the column in a collection tube;
4) Load the sample from step 2) into the spin column from step 3), close the cap tightly, and mix by inversion;
5) Centrifuge at 12,000 × g for 30 s and discard the flow-through;
6) Add 100 μL of M-Wash Buffer to the spin column and centrifuge at 12,000 × g for 30 s;
7) Add 200 μL of L-Desulphonation Buffer to the spin column, incubate at room temperature for 15 min, and centrifuge at 12,000 × g for 30 s;
8) Add 200 μL of M-Wash Buffer to the spin column and centrifuge at 12,000 × g for 30 s;
9) Place the spin column in a new 1.5 mL centrifuge tube, add 10 μL of M-Elution Buffer, centrifuge at 12,000 × g for 30 s, and collect the DNA solution for methylation detection.
(2) Quantitative methylation-specific PCR (qMSP)
1) Each batch of qMSP experiments must include the quality controls listed below.
No-template control (NTC): sample Nuclease-Free Water, 1 reaction. Weakly positive methylation control: sample DNA with a 5% methylation rate, 1 reaction. Negative methylation control: sample blood cell DNA, 1 reaction.
2) qMSP was performed using the EpiTect MethyLight PCR + ROX Vial Kit (QIAGEN), and the qMSP reaction system was prepared as follows: bisulfite-converted DNA 5 μL/rxn; 2× EpiTect MethyLight Master Mix 10 μL/rxn; upstream primer (ACTB & Target, 10 μM) 0.8 μL/rxn; downstream primer (ACTB & Target, 10 μM) 0.8 μL/rxn; fluorescent probe (ACTB & Target, 10 μM) 0.4 μL/rxn; 50× ROX Dye Solution 0.4 μL/rxn; Nuclease-Free Water 2.6 μL/rxn.
3) The reaction was placed in a StepOnePlus Real-Time PCR System (Applied Biosystems) and the following program was run: 95 ℃ for 10 min; (95 ℃ for 20 s, 60 ℃ for 1 min) × 45 cycles, with fluorescence collected at 60 ℃;
4) After the qMSP reaction is finished, the threshold of the gene to be tested and the threshold of the reference gene are adjusted manually in the StepOnePlus analysis software, and the methylation value is then analyzed and calculated.
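The per-reaction volumes of the qMSP reaction system in step 2) can be checked and scaled to a batch with a short script. This is only a sketch: the second primer entry is treated as the downstream (reverse) primer, and the 10% pipetting overage and the function name `scale_mix` are assumptions for illustration, not part of the described method.

```python
# Per-reaction volumes in microlitres, taken from the reaction system in step 2).
QMSP_MIX_UL = {
    "bisulfite-converted DNA": 5.0,
    "2x EpiTect MethyLight Master Mix": 10.0,
    "upstream primer (10 uM)": 0.8,
    "downstream primer (10 uM)": 0.8,  # assumed: second primer entry is the reverse primer
    "fluorescent probe (10 uM)": 0.4,
    "50x ROX Dye Solution": 0.4,
    "Nuclease-Free Water": 2.6,
}


def scale_mix(per_rxn, n_reactions, overage=0.10):
    """Scale the shared components for n reactions plus a pipetting overage.

    The template DNA is excluded because it is added to each well separately.
    The 10% overage is a hypothetical default.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in per_rxn.items()
        if name != "bisulfite-converted DNA"
    }


total_per_rxn = round(sum(QMSP_MIX_UL.values()), 2)  # total reaction volume in uL
batch = scale_mix(QMSP_MIX_UL, n_reactions=8)
```

The listed components add up to a 20 μL reaction, which supports reading the repeated primer line as one upstream and one downstream primer.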
Examples
Example 1
This embodiment provides a method for constructing a urine-based early screening model for prostate cancer. The test samples are obtained by using the kit provided in the preparation example to extract urine supernatant cfDNA from the urine samples and sequence its genome, yielding the sequencing results of the urine supernatant cfDNA.
In this example, 107 urine samples, comprising 47 healthy urine samples and 60 tumor urine samples, were used as the training set.
The method for constructing the early prostate cancer screening model specifically comprises the following steps:
(1) The kit provided in the preparation example is used to extract urine supernatant cfDNA from the 107 urine samples in the training set and to sequence its genome, yielding the sequencing results of the urine supernatant cfDNA.
(2) Quality control is performed on the sequencing results of the urine supernatant cfDNA obtained in step (1); the reads are aligned to the hg19 reference genome, then sorted and de-duplicated to obtain a bam file; the resulting bam files are filtered to remove multi-mapped and low-quality reads, and the following analyses are performed.
The resulting bam files are analyzed in four dimensions: CNV events, motif types, fragment distribution characteristics and methylation characteristics. The specific steps are as follows:
Method for analysis of copy number variation (CNV) events. The method specifically comprises the following steps:
1) Constructing a genome bed file: the bam file is divided into bins of 200 kbp in length using the hg19 reference genome; the coverage of each 200 kbp bin is calculated, the calculated coverage is first GC-corrected, and self-normalization is then performed to obtain a normalized sample.
In the GC correction, the read count of each bin is corrected using the stress command of the R 3.6.0 software to obtain the corrected read count of each bin, denoted $RC_{gc}$.
The RC value of each bin after self-normalization is calculated as follows:
$RC = RC_{gc} / \mathrm{mean}(RC_{gc\text{-}all\text{-}bin})$.
$\mathrm{mean}(RC_{gc\text{-}all\text{-}bin})$ denotes the mean of the GC-corrected read counts over all bins.
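The binning and self-normalization in step 1) can be sketched as follows. This is a minimal illustration, not the described pipeline: the GC correction itself is not reproduced, reads are represented simply by their start positions on one chromosome, and the function names are hypothetical.

```python
def count_reads_per_bin(read_starts, chrom_length, bin_size=200_000):
    """Count reads whose start falls in each fixed-width bin of one chromosome."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in read_starts:
        if 0 <= pos < chrom_length:
            counts[pos // bin_size] += 1
    return counts


def self_normalize(rc_gc):
    """RC = RC_gc / mean(RC_gc over all bins), applied to GC-corrected counts."""
    if not rc_gc:
        raise ValueError("no bins supplied")
    mean_all = sum(rc_gc) / len(rc_gc)
    if mean_all == 0:
        raise ValueError("mean read count is zero")
    return [value / mean_all for value in rc_gc]


# Toy input: four read starts on a 500 kbp chromosome, then three GC-corrected bins.
raw = count_reads_per_bin([0, 199_999, 200_000, 450_000], chrom_length=500_000)
rc = self_normalize([80.0, 100.0, 120.0])
```

After self-normalization the bin values average to 1, so samples sequenced to different depths become comparable bin by bin.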
2) Constructing the baseline: 20 healthy urine samples are randomly selected from the normalized samples obtained in step 1); based on the selected samples, regions carrying no information and regions with heavy noise, such as centromeres, telomeres and repeat regions, are processed first, and the expected coverage and the median segment variance of each bin are then calculated.
The expected coverage is calculated as follows:
expected coverage $RC_E = \mathrm{mean}(RC)$.
$\mathrm{mean}(RC)$ denotes the mean of the self-normalized read counts over all bins.
The median segment variance is calculated as follows:
median segment variance $MSV = 1 / (RC \times \text{bin size})$.
The bin size denotes the size of the bin interval.
3) The expected coverage and median segment variance in baseline obtained in step 2) were used for self-normalization of other urine samples.
4) Calculating the $\log_2$ ratio using the expected coverage and the median segment variance in the baseline obtained in steps 2) and 4): $\log_2 \text{ratio} = \log_2(RC / RC_E)$.
The segments are then delimited using the Circular Binary Segmentation (CBS) method to obtain the mean $\log_2$ ratio of each segment. Whether a CNV event occurs in a segment is determined from its mean $\log_2$ ratio, and, according to the determination result, the total length of the segments with CNV events on each autosome of the test sample is accumulated and recorded as CNV_length.
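The log2 ratio and the accumulation of CNV_length can be sketched as below. Several points are assumptions made for illustration: the expected coverage is taken as the per-bin mean across the baseline samples, the CBS segmentation is not implemented (segments are supplied as input), and a segment is called a CNV event when the absolute value of its mean log2 ratio reaches a caller-supplied cutoff, since the text does not state the decision rule.

```python
import math


def expected_coverage(baseline_rc):
    """Per-bin mean of normalized RC across baseline samples (assumed reading of RC_E)."""
    n_samples = len(baseline_rc)
    n_bins = len(baseline_rc[0])
    return [sum(sample[i] for sample in baseline_rc) / n_samples for i in range(n_bins)]


def log2_ratio(rc, rc_e):
    """log2(RC / RC_E) per bin; bins with non-positive values are returned as None."""
    return [math.log2(r / e) if r > 0 and e > 0 else None for r, e in zip(rc, rc_e)]


def cnv_length(segments, cutoff):
    """Total length of segments called as CNV events.

    segments: iterable of (start, end, mean_log2_ratio) with half-open coordinates.
    cutoff: hypothetical threshold on |mean log2 ratio|; not given in the text.
    """
    return sum(end - start for start, end, mean_lr in segments if abs(mean_lr) >= cutoff)


rc_e = expected_coverage([[1.0, 2.0], [3.0, 2.0]])
ratios = log2_ratio([2.0, 1.0, 0.0], [1.0, 2.0, 1.0])
segments = [(0, 1_000_000, 0.5), (1_000_000, 3_000_000, 0.05), (3_000_000, 3_400_000, -0.4)]
total = cnv_length(segments, cutoff=0.3)  # cutoff value chosen only for this toy example
```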
The above CNV event analysis method is learned using a Support Vector Machine (SVM), and statistical analysis is performed. The CNV_length values of the 107 urine samples in the training set are shown in Table 1.
The CNV_length values given in Table 1 are the sums of the CNV_length values of chromosome arms 2p, 3p, 5q, 6q, 7p, 8q and 13q.
TABLE 1 Analysis results for 107 urine samples: CNV events
[Table 1 is provided as an image in the original document.]
5) Taking the chromosome arm as the unit, the rank-sum test (one-sided test) is used to test whether each chromosome arm can serve as a marker.
With a decision threshold of p < 0.0001, the chromosome arms for which the total length of segments with CNV events (CNV_length) in tumor samples is significantly higher than the CNV_length in healthy samples are taken as markers.
After analysis, the markers for CNV events are finally determined to be 2p, 3p, 5q, 6q, 7p, 8q and 13q (the number denotes the chromosome; p denotes the short arm of the chromosome; q denotes the long arm).
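The one-sided rank-sum test used for marker selection can be illustrated with a small standard-library implementation. This is a sketch under stated assumptions: it uses the normal approximation of the Mann-Whitney U statistic with midranks for ties and without continuity or tie-variance correction, and the text does not say which software or variant was actually used.

```python
import math


def rank_sum_one_sided(tumor, healthy):
    """One-sided rank-sum test, alternative hypothesis: tumor values are larger.

    Returns (U statistic for the tumor group, approximate p-value).
    """
    n1, n2 = len(tumor), len(healthy)
    pooled = sorted([(v, 0) for v in tumor] + [(v, 1) for v in healthy])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    rank_sum_tumor = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = rank_sum_tumor - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return u1, p_value


# Toy data: every tumor value exceeds every healthy value.
u_stat, p = rank_sum_one_sided(list(range(11, 21)), list(range(1, 11)))
is_marker = p < 0.0001  # threshold from the text
```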
Method for analysis of motif types. The method specifically comprises the following steps:
1) All 4-bp end-sequence motif types are considered: the number of occurrences of each motif type is counted from the bam file, and the proportion of each motif type among all occurring motif types is then calculated.
The proportion of each motif type among all occurring motif types is calculated as follows:
proportion = (number of occurrences of a given motif type) / (total number of occurrences of all motif types).
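The proportion formula above can be sketched with a simple counter. The input is assumed to be a list of 4-bp end sequences already extracted from the aligned fragments; how they are read from the bam file is not shown, and the function name is hypothetical.

```python
from collections import Counter


def motif_proportions(end_motifs):
    """Proportion of each 4-bp end motif among all counted motifs.

    Entries that are not exactly four A/C/G/T bases are skipped (an assumption).
    """
    valid = set("ACGT")
    counts = Counter(
        motif.upper()
        for motif in end_motifs
        if len(motif) == 4 and set(motif.upper()) <= valid
    )
    total = sum(counts.values())
    return {motif: n / total for motif, n in counts.items()} if total else {}


props = motif_proportions(["CCCA", "ccca", "ATAC", "TGAG", "NNNN"])
```

With four bases at four positions there are 4^4 = 256 possible motif types, which is the number tested in the marker selection step.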
2) The motif type analysis method is learned using an SVM (support vector machine), statistical analysis is performed, and the markers for motif types are determined.
The rank-sum test (one-sided test) is applied to each of the 256 motif types, and suitable motif types are selected as markers according to a threshold of p < 0.0001. In total, 53 motif types are obtained for the 107 urine samples.
The analysis results of the 53 motif types in the 107 urine samples are shown in Tables 2-5.
TABLE 2 Analysis results for 107 urine samples: motif types (I)
[Table 2 is provided as a series of images in the original document.]
TABLE 3 Analysis results for 107 urine samples: motif types (II)
[Table 3 is provided as a series of images in the original document.]
TABLE 4 Analysis results for 107 urine samples: motif types (III)
[Table 4 is provided as a series of images in the original document.]
TABLE 5 Analysis results for 107 urine samples: motif types (IV)
[Table 5 is provided as a series of images in the original document.]
3) The markers finally determined according to the threshold p < 0.0001 are the 53 motif types present: ATAC, ATAT, ATAG, ATGA, ATGT, AGCC, AGTC, TACG, TTAT, TTCA, TGAG, TGGG, TGGC, TGAC, TGCG, TCAC, TCAG, TCTC, TCTA, TCCG, GAAC, GAAT, GATC, GAGC, GACG, GACC, GTAT, GGCT, GGCG, GGCC, GCAG, GCTC, GCCG, GCCC, CATC, CACG, CTAA, CTAT, CTAG, CTAC, CTTA, CTTT, CTTG, CTTC, CTGC, CTGT, CTCT, CGAG, CGGT, CGCG, CCAT.
Method for analysis of fragment distribution characteristics.
1) Analyzing the fragment characteristics of the bam file: analysis of variance is used to determine the threshold separating long and short fragments and to obtain its specific value; a fragment longer than the threshold is a long fragment, and a fragment shorter than the threshold is a short fragment.
The analysis is performed using the Gradient Boosting Machine (GBM) method, which gives a threshold of 145 bp; that is, a fragment longer than 145 bp is a long fragment and a fragment shorter than 145 bp is a short fragment.
2) The genome is divided into intervals of 5 Mb in length, the number of aligned long fragments and the number of aligned short fragments in each interval are counted, and their ratio is calculated.
The ratio is calculated as follows:
ratio = number of long fragments in the interval / number of short fragments in the interval.
The fragment distribution analysis method is learned using an SVM, statistical analysis is performed, and the markers for fragment distribution characteristics are determined. The analysis results of the fragment distribution characteristics of the 107 urine samples in the training set are obtained (the analysis results of the healthy urine samples and of the tumor urine samples).
3) The ratios from the analysis results of the healthy urine samples are compared with those of the tumor urine samples.
The comparison shows that the ratios of the healthy urine samples differ from those of the tumor urine samples. Based on the comparison, the long-fragment and short-fragment distribution characteristics on all autosomes are finally selected as markers.
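The per-interval long/short fragment ratio described above can be sketched as follows. The input format (chromosome, start, length) and the function name are assumptions; fragments of exactly 145 bp are left out because the text assigns them to neither class, and an interval without short fragments returns None rather than a ratio.

```python
from collections import defaultdict


def long_short_ratios(fragments, threshold_bp=145, interval_bp=5_000_000):
    """Ratio of long to short fragment counts per fixed-width genomic interval.

    fragments: iterable of (chrom, start, length) for aligned fragments.
    """
    counts = defaultdict(lambda: [0, 0])  # interval -> [n_long, n_short]
    for chrom, start, length in fragments:
        key = (chrom, start // interval_bp)
        if length > threshold_bp:
            counts[key][0] += 1
        elif length < threshold_bp:
            counts[key][1] += 1
    return {
        key: (n_long / n_short if n_short else None)
        for key, (n_long, n_short) in counts.items()
    }


example = [
    ("chr1", 100, 167),
    ("chr1", 2_000, 120),
    ("chr1", 4_999_999, 180),
    ("chr1", 5_000_000, 150),
    ("chr1", 6_000_000, 145),
]
ratios_by_interval = long_short_ratios(example)
```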
Method for analysis of methylation characteristics. The method specifically comprises the following steps:
1) Differential analysis is performed on TCGA tissue data to select differentially methylated genes and a reference gene (the ACTB gene); the differentially methylated genes in the test sample are quantified by quantitative methylation-specific polymerase chain reaction (qMSP), and the methylation values of the differentially methylated genes are calculated relative to the reference gene.
The methylation analysis method is learned using an SVM (support vector machine), and statistical analysis is performed. The six differential genes screened from the TCGA data are HOXD3, RASSF1, GSTP1, CDH1, FOXP1 and APC, and the methylation values of these six genes are analyzed.
Following the method of the preparation example, after the qMSP reaction is finished, the ΔRn thresholds are manually set in the StepOnePlus analysis software to 1.0 for the HOXD3 gene, 1.0 for the RASSF1 gene, 1.1 for the GSTP1 gene, 0.8 for the CDH1 gene, 0.8 for the FOXP1 gene, 0.4 for the APC gene and 0.4 for the ACTB gene (reference gene); the $\Delta C_T$ values of HOXD3-ACTB, RASSF1-ACTB, GSTP1-ACTB, CDH1-ACTB, FOXP1-ACTB and APC-ACTB are then calculated to obtain the $\Delta C_T$ values of the six differential genes.
Interpretation of results:
If the $\Delta C_T$ value of the HOXD3 gene is ≤ 12.0, the HOXD3 target is judged positive; if it is > 12.0, the HOXD3 target is judged negative.
If the $\Delta C_T$ value of the RASSF1 gene is ≤ 18.0, the RASSF1 target is judged positive; if it is > 18.0, the RASSF1 target is judged negative.
If the $\Delta C_T$ value of the GSTP1 gene is ≤ 11.0, the GSTP1 target is judged positive; if it is > 11.0, the GSTP1 target is judged negative.
If the $\Delta C_T$ value of the CDH1 gene is ≤ 13, the CDH1 target is judged positive; if it is > 13, the CDH1 target is judged negative.
If the $\Delta C_T$ value of the FOXP1 gene is ≤ 15, the FOXP1 target is judged positive; if it is > 15, the FOXP1 target is judged negative.
If the $\Delta C_T$ value of the APC gene is ≤ 7.0, the APC target is judged positive; if it is > 7.0, the APC target is judged negative.
Combining the detection results of the six targets, a sample with a positive result for ≥ 1 target is judged to be a prostate cancer urine sample, and a sample with negative results for all targets is judged to be a healthy urine sample.
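The interpretation rules above can be written as a small decision function. This sketch assumes that the ΔC_T of a target is its C_T minus the ACTB C_T (the reading suggested by "HOXD3-ACTB" and similar), and that a target with no C_T (passed as None) counts as negative; both points, and all function names, are illustrative assumptions.

```python
# Delta C_T cutoffs per target, as listed in the interpretation rules.
DELTA_CT_CUTOFFS = {
    "HOXD3": 12.0,
    "RASSF1": 18.0,
    "GSTP1": 11.0,
    "CDH1": 13.0,
    "FOXP1": 15.0,
    "APC": 7.0,
}


def delta_ct(ct_target, ct_actb):
    """Delta C_T = C_T(target) - C_T(ACTB); assumed sign convention."""
    return ct_target - ct_actb


def interpret_sample(ct_targets, ct_actb):
    """Return per-target calls and the overall sample label."""
    calls = {}
    for gene, cutoff in DELTA_CT_CUTOFFS.items():
        ct = ct_targets.get(gene)
        # Assumption: a missing C_T (no amplification) is treated as negative.
        calls[gene] = ct is not None and delta_ct(ct, ct_actb) <= cutoff
    label = "prostate cancer" if any(calls.values()) else "healthy"
    return calls, label


# Toy C_T values, not measured data.
calls_a, label_a = interpret_sample(
    {"HOXD3": 36.0, "RASSF1": 45.0, "GSTP1": 45.0, "CDH1": 45.0, "FOXP1": 45.0, "APC": 45.0},
    ct_actb=25.0,
)
calls_b, label_b = interpret_sample({gene: 45.0 for gene in DELTA_CT_CUTOFFS}, ct_actb=25.0)
```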
The analysis results of the six differential genes in the 107 urine samples are shown in Table 6. The markers for methylation characteristics are finally determined to be the 6 differential genes: HOXD3, RASSF1, GSTP1, CDH1, FOXP1 and APC.
TABLE 6 Analysis results for 107 urine samples: methylation characteristics
[Table 6 is provided as a series of images in the original document.]
(3) The analysis results of the 4 dimensions are compiled, an analysis matrix is constructed from the finally selected markers, and a linear regression model is established.
The markers in the 4 dimensions (CNV events, motif types, fragment distribution characteristics and methylation characteristics) obtained from the 107 urine samples in step (2) are integrated into one matrix, and the marker data of each dimension obtained from the 107 urine samples are standardized using a scaler. The standardized data are shown in Table 7.
TABLE 7 Standardized data for 107 urine samples
[Table 7 is provided as a series of images in the original document.]
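The standardization step can be illustrated as below. The text only names a "scaler", so this sketch assumes a column-wise z-score (subtract the column mean, divide by the population standard deviation), which is one common choice; the actual scaler used is not specified, and the function name is hypothetical.

```python
import math


def standardize_columns(matrix):
    """Column-wise z-score of a samples-by-markers matrix (list of equal-length rows).

    Columns with zero variance are mapped to 0.0 (an assumption).
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        mean = sum(column) / n_rows
        std = math.sqrt(sum((x - mean) ** 2 for x in column) / n_rows)
        for i in range(n_rows):
            out[i][j] = (column[i] - mean) / std if std > 0 else 0.0
    return out


# Toy matrix: 2 samples x 2 markers.
scaled = standardize_columns([[1.0, 10.0], [3.0, 10.0]])
```

Standardizing puts markers with very different scales, such as CNV_length in base pairs and motif proportions below 1, on a comparable footing before fitting.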
A linear regression model is constructed from the standardized data to obtain a classification model, which is then saved.
The linear regression model is as follows:
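$Y = (A \times a) + \{(B_1 \times b_1) + (B_2 \times b_2) + (B_3 \times b_3) + \cdots + (B_{M-2} \times b_{M-2}) + (B_{M-1} \times b_{M-1}) + (B_M \times b_M)\}$
$\quad + \{(C_1 \times c_1) + (C_2 \times c_2) + (C_3 \times c_3) + \cdots + (C_{N-2} \times c_{N-2}) + (C_{N-1} \times c_{N-1}) + (C_N \times c_N)\}$
$\quad + \{(D_1 \times d_1) + (D_2 \times d_2) + \cdots + (D_{P-2} \times d_{P-2}) + (D_{P-1} \times d_{P-1}) + (D_P \times d_P)\}$.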
In the above model, $Y$ denotes the final prediction score of the urine sample obtained from the analysis of CNV events, motif types, fragment distribution characteristics and methylation characteristics.
$A$ denotes the fitting coefficient for CNV events; $a$ denotes the total length of segments with CNV events (CNV_length) on each autosome of the test sample.
$B_1, B_2, B_3, \ldots, B_{M-2}, B_{M-1}, B_M$ ($B_1 \sim B_M$) denote the fitting coefficients corresponding to the respective markers in the motif types; $b_1, b_2, b_3, \ldots, b_{M-2}, b_{M-1}, b_M$ ($b_1 \sim b_M$) denote the analysis results corresponding to the respective markers in the motif types. In this application there are 53 markers in the motif types, so $M$ is 53.
$C_1, C_2, C_3, \ldots, C_{N-2}, C_{N-1}, C_N$ ($C_1 \sim C_N$) denote the fitting coefficients corresponding to the respective ratios calculated in the fragment distribution characteristics; $c_1, c_2, c_3, \ldots, c_{N-2}, c_{N-1}, c_N$ ($c_1 \sim c_N$) denote the ratios calculated in the fragment distribution characteristics. In this application, 504 ratios are calculated in the fragment distribution characteristics, so $N$ is 504.
$D_1, D_2, \ldots, D_{P-2}, D_{P-1}, D_P$ ($D_1 \sim D_P$) denote the fitting coefficients corresponding to the respective differential genes selected in the methylation characteristics; $d_1, d_2, \ldots, d_{P-2}, d_{P-1}, d_P$ ($d_1 \sim d_P$) denote the $\Delta C_T$ values corresponding to the respective differential genes selected in the methylation characteristics. In this application, 6 differential genes are selected, so $P$ is 6.
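The prediction score Y is a weighted sum over the four feature groups, which can be sketched as below. The function name and the toy coefficient values are hypothetical; the fitted coefficients themselves are those of Table 8 and are not reproduced here.

```python
def prediction_score(A, a, B, b, C, c, D, d):
    """Y = A*a + sum(B_i*b_i) + sum(C_j*c_j) + sum(D_k*d_k).

    A, a: coefficient and value for CNV_length.
    B, b: coefficients and values for the motif markers (M entries).
    C, c: coefficients and values for the fragment ratios (N entries).
    D, d: coefficients and values for the methylation Delta C_T values (P entries).
    """
    for name, coefs, values in (("motif", B, b), ("fragment", C, c), ("methylation", D, d)):
        if len(coefs) != len(values):
            raise ValueError(f"{name}: {len(coefs)} coefficients but {len(values)} values")

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    return A * a + dot(B, b) + dot(C, c) + dot(D, d)


# Toy numbers only, chosen so the sum is easy to follow: 6 + 1 + 0.5 - 1.
y = prediction_score(2.0, 3.0, [1.0, 1.0], [0.5, 0.5], [2.0], [0.25], [1.0], [-1.0])
```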
The coefficients involved in the regression model are specifically shown in table 8.
TABLE 8 Coefficients involved in the regression model
[Table 8 is provided as a series of images in the original document.]
Example 2
This embodiment provides a method for using the urine-based early screening model for prostate cancer, which can also be used to verify the accuracy of the model provided in Example 1. The test samples are obtained by using the kit provided in the preparation example to extract urine supernatant cfDNA from the urine samples and sequence its genome, yielding the sequencing results of the urine supernatant cfDNA. In this example, 106 urine samples, comprising 46 healthy urine samples and 60 tumor urine samples, were used as the test set. The specific steps are as follows:
(1) The kit provided in the preparation example is used to extract urine supernatant cfDNA from the 106 urine samples in the test set and to sequence its genome, yielding the sequencing results of the urine supernatant cfDNA.
(2) Using the markers in the 4 dimensions (CNV events, motif types, fragment distribution characteristics and methylation characteristics) provided in step (2) of Example 1, the feature value of each marker in each dimension is calculated for the 106 urine samples, and the resulting feature values are combined into a matrix.
(3) The data of the 106 urine samples are analyzed using the classification model obtained in Example 1 to obtain the final prediction score of each urine sample; the predicted classification is determined from the prediction scores, and a ROC curve is drawn.
The prediction scores and predicted classification results of the 106 urine samples are shown in Table 9, and the ROC curve is shown in Fig. 1.
FIG. 1 is a ROC plot for early stage screening for prostate cancer based on urine in example 2 of the present application. As can be seen from fig. 1, AUC is 0.944.
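For reference, the area under a ROC curve can be computed directly from prediction scores and true labels. The sketch below uses the pairwise (Mann-Whitney) formulation, in which AUC is the fraction of tumor/healthy pairs where the tumor sample scores higher, with ties counted as one half. The function name and the toy scores are illustrative and unrelated to the data of Table 9.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive sample outscores a negative one.

    labels: 1 for tumor (positive), 0 for healthy (negative).
    """
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


auc_perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
auc_partial = roc_auc([0.9, 0.2, 0.3, 0.1], [1, 1, 0, 0])
```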
TABLE 9 Prediction scores and predicted classification results for 106 urine samples
[Table 9 is provided as a series of images in the original document.]
The accuracy of the predicted classification results obtained in table 9 was checked and the results are shown in table 10.
TABLE 10 Accuracy test results
[Table 10 is provided as an image in the original document.]
The accuracy is calculated as follows: accuracy (%) = number of correctly classified samples / total number of samples in the predicted classification results × 100. These results show that the screening model provided by this application can accurately screen prostate cancer samples at an early stage, with an accuracy of 94.33%, and has good clinical application value.
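The accuracy formula can be expressed in a few lines. The labels below are toy values for illustration, not the classification results of Table 9, and the function name is hypothetical.

```python
def accuracy_percent(predicted, actual):
    """Accuracy (%) = correctly classified samples / total samples x 100."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual) * 100


# Toy labels: three of four predictions match.
acc = accuracy_percent(["tumor", "tumor", "healthy", "healthy"],
                       ["tumor", "healthy", "healthy", "healthy"])
```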

Claims (5)

1. A construction method of a prostate cancer early-stage screening model based on urine is characterized by specifically comprising the following steps:
extracting urine of the healthy sample and the cancer sample and sequencing to obtain a sequencing result;
analyzing the sequencing result to obtain prostate cancer related mutation markers of the healthy sample and the cancer sample; the prostate cancer related mutant marker comprises a CNV event, a motif type, a fragment distribution characteristic and a methylation characteristic;
integrating the mark marker data of the prostate cancer related mutation marker into a matrix, standardizing the mark marker data of each dimension by using a scaler.
The analysis method of the CNV event specifically comprises the following steps:
1) Obtaining a bam file from sequencing results of said urine, the bam file being divided according to bins of 200Kbp length using hg19 reference genome; calculating the coverage of each bin according to bins with the length of 200Kbp, firstly carrying out GC correction on the coverage obtained by calculation, and then carrying out self-standardization to obtain a standardized sample;
the GC correction corrects the read count of each bin using the stress command of the R 3.6.0 software to obtain the corrected read count of each bin, denoted $RC_{gc}$;
the RC value of each bin after self-normalization is calculated as follows: $RC = RC_{gc} / \mathrm{mean}(RC_{gc\text{-}all\text{-}bin})$;
$\mathrm{mean}(RC_{gc\text{-}all\text{-}bin})$ denotes the mean of the GC-corrected read counts over all bins;
2) Constructing baseline: randomly selecting 20 healthy urine samples from the standardized samples obtained in the step 1); based on the selected samples, first processing the areas without information and the areas with a large amount of noise, and then calculating the expected coverage and median segment variance of each bin;
wherein the expected coverage is calculated as follows: expected coverage $RC_E = \mathrm{mean}(RC)$;
$\mathrm{mean}(RC)$ denotes the mean of the self-normalized read counts over all bins;
wherein the median segment variance is calculated as follows:
median segment variance $MSV = 1 / (RC \times \text{bin size})$; bin size denotes the size of the bin interval;
3) Using the expected coverage and median segment variance in baseline obtained in step 2) for self-normalization of other urine samples;
4) Calculating the $\log_2$ ratio using the expected coverage and the median segment variance in the baseline obtained in step 2) and step 4); $\log_2 \text{ratio} = \log_2(RC / RC_E)$;
the segments are then delimited using the CBS method to obtain the mean $\log_2$ ratio of each segment; whether a CNV event occurs in a segment is determined from its mean $\log_2$ ratio, and, according to the determination result, the total length of the segments with CNV events on each autosome in the healthy sample and the cancer sample is accumulated;
learning the analysis method of the CNV event by using an SVM, and performing statistical analysis processing; by judging that the threshold value p is less than 0.0001, the chromosome arm of which the total length of the segment of the CNV event in the tumor sample is obviously higher than that of the segment of the CNV event in the healthy sample is used as a mark marker, and the mark marker of the CNV event is finally determined;
the method for analyzing the motif type specifically comprises the following steps:
1) Obtaining a BAM file from the sequencing result of the urine, counting the number of occurrences of each motif type in the BAM file, and then calculating the proportion of each motif type among all occurring motif types;
wherein the proportion of each motif type among all occurring motif types is calculated as follows:
proportion = (number of occurrences of a given motif type) / (total number of occurrences of all motif types);
learning the motif type analysis method by using an SVM (support vector machine), performing statistical analysis, and determining the markers of the motif types; applying a rank-sum test to each of the 256 motif types and selecting as markers the motif types with p < 0.0001, whereby the markers of the motif types are finally determined;
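The proportion calculation can be sketched as follows. This sketch assumes that the 256 motif types are the 4-base sequences over A, C, G and T (4^4 = 256) and that the motifs have already been extracted from the BAM file into a list of strings; the function name and the rule of discarding motifs containing other characters are hypothetical. The per-motif rank-sum test between healthy and tumor samples is not shown.

```python
from collections import Counter
from itertools import product

# All 256 possible 4-base motifs (assumption: a motif type is a 4-mer).
ALL_MOTIFS = ["".join(p) for p in product("ACGT", repeat=4)]
VALID = set(ALL_MOTIFS)


def motif_proportions(motifs):
    """proportion = occurrences of one motif type / occurrences of all types.

    Motifs containing characters other than A, C, G, T (for example N)
    are discarded before counting.
    """
    counts = Counter(m for m in motifs if m in VALID)
    total = sum(counts.values())
    if total == 0:
        return {m: 0.0 for m in ALL_MOTIFS}
    return {m: counts[m] / total for m in ALL_MOTIFS}


props = motif_proportions(["CCCA", "CCCA", "AAAA", "NNNN"])
```

Each sample then yields a vector of 256 proportions, which is the form a per-motif comparison between the healthy and tumor groups would take as input.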
the analysis method of the fragment distribution characteristics specifically comprises the following steps:
1) Obtaining a BAM file from the sequencing result of the urine, and determining the threshold separating large fragments from small fragments by using variance analysis to obtain the specific value of the threshold, wherein a gene fragment longer than the threshold is a long fragment and a gene fragment shorter than the threshold is a short fragment;
the GBM method is used for the analysis, which yields a threshold of 145 bp, namely a fragment longer than 145 bp is a long fragment and a fragment shorter than 145 bp is a short fragment;
2) Dividing the genome into intervals with a length of 5 Mb, counting the number of long fragments and the number of short fragments aligned within each interval, and calculating the ratio;
wherein the ratio is calculated as follows: ratio = number of long fragments / number of short fragments within the interval;
3) Comparing the ratios obtained for the healthy urine samples with the ratios obtained for the tumor urine samples, and taking the long-fragment and short-fragment distribution characteristics whose ratios differ as markers, whereby the markers of the fragment distribution characteristics are finally determined;
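Steps 1) and 2) can be sketched as follows, assuming the fragments have already been read from the BAM file as (chromosome, start position, fragment length) tuples. The function name is hypothetical. The claim does not say how a fragment of exactly 145 bp is classified, so this sketch leaves such fragments uncounted, and it returns None for an interval with no short fragments.

```python
from collections import defaultdict

BIN_SIZE = 5_000_000   # 5 Mb intervals, as in step 2)
THRESHOLD_BP = 145     # long/short threshold, as in step 1)


def long_short_ratio(fragments):
    """Return {(chrom, interval index): long count / short count}."""
    counts = defaultdict(lambda: [0, 0])  # [long, short] per interval
    for chrom, start, length in fragments:
        key = (chrom, start // BIN_SIZE)
        if length > THRESHOLD_BP:
            counts[key][0] += 1
        elif length < THRESHOLD_BP:
            counts[key][1] += 1
        # length == THRESHOLD_BP is not counted (unspecified in the claim)
    return {
        key: (n_long / n_short if n_short else None)
        for key, (n_long, n_short) in counts.items()
    }


fragments = [
    ("chr1", 100, 160),
    ("chr1", 200, 120),
    ("chr1", 300, 170),
    ("chr1", 400, 145),
    ("chr1", 5_000_100, 100),
]
ratios_by_interval = long_short_ratio(fragments)
```

The resulting per-interval ratios are the values that step 3) compares between the healthy and tumor urine samples.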
the method for analyzing the methylation characteristics specifically comprises the following steps:
according to the sequencing result of the urine, performing differential analysis using TCGA tissue data, selecting the differentially methylated genes and the internal reference gene ACTB, quantifying the differentially methylated genes of a test sample by methylation-specific polymerase chain reaction, calculating the methylation value of each differentially methylated gene relative to the internal reference gene, and judging each differential gene target to be negative or positive.
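The claim does not give the formula for the methylation value or the cutoff for the negative/positive call. One common way to express a target relative to a reference gene in quantitative PCR is the delta-Ct form 2^-(Ct_target - Ct_reference); the sketch below uses that form purely as an assumption. The function names and the cutoff value are hypothetical and are not taken from the patent.

```python
def relative_methylation(ct_target, ct_actb):
    """Assumed delta-Ct form: 2 ** -(Ct_target - Ct_ACTB)."""
    return 2.0 ** (-(ct_target - ct_actb))


def call_target(ct_target, ct_actb, cutoff):
    """Return 'positive' if the relative value reaches the cutoff.

    The cutoff is a hypothetical parameter; the claim does not state one.
    """
    value = relative_methylation(ct_target, ct_actb)
    return "positive" if value >= cutoff else "negative"


# Example with made-up Ct values and a made-up cutoff of 0.1.
value = relative_methylation(30.0, 28.0)
call_a = call_target(30.0, 28.0, cutoff=0.1)
call_b = call_target(35.0, 28.0, cutoff=0.1)
```

A target amplifying two cycles later than ACTB gives a relative value of 0.25 under this assumed form, which the example cutoff would call positive.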
2. The method for constructing a urine-based early screening model for prostate cancer as claimed in claim 1, wherein the markers of said fragment distribution characteristics are the long-fragment distribution characteristics and the short-fragment distribution characteristics on all autosomes.
3. The method for constructing a urine-based early screening model for prostate cancer as claimed in claim 1, wherein the differential genes screened out by said methylation characteristics include the HOXD3 gene, RASSF1 gene, GSTP1 gene, CDH1 gene, FOXP1 gene and APC gene.
4. An early screening model for prostate cancer constructed using the construction method of any one of claims 1 to 3.
5. A kit for extracting and detecting urine in the construction method according to any one of claims 1 to 3; the kit comprises a urine supernatant cfDNA extraction part, a genome sequencing part, a urine sediment cell genomic DNA extraction part and a gene characteristic region CpG methylation detection part.
CN202210660400.XA 2022-06-13 2022-06-13 Construction method of prostate cancer early screening model based on urine, screening model and kit Active CN114743593B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210660400.XA CN114743593B (en) 2022-06-13 2022-06-13 Construction method of prostate cancer early screening model based on urine, screening model and kit

Publications (2)

Publication Number Publication Date
CN114743593A CN114743593A (en) 2022-07-12
CN114743593B CN114743593B (en) 2023-02-24

Family

ID=82286902

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210660400.XA Active CN114743593B (en) 2022-06-13 2022-06-13 Construction method of prostate cancer early screening model based on urine, screening model and kit

Country Status (1)

Country Link
CN (1) CN114743593B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116564508B (en) * 2023-07-07 2023-09-29 北京橡鑫生物科技有限公司 Early prostate cancer screening model and construction method thereof

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2491067A1 (en) * 2004-12-24 2006-06-24 Stichting Katholieke Universiteit mRNA ratios in urinary sediments and/or urine as a prognostic marker for prostate cancer
WO2011049237A1 (en) * 2009-10-23 2011-04-28 財団法人東京都医学研究機構 Biomarkers for predicting therapeutic effect in hyposensitization therapy
US20140094380A1 (en) * 2011-04-04 2014-04-03 The Board Of Trustees Of The Leland Stanford Junior University Methylation Biomarkers for Diagnosis of Prostate Cancer
US11180807B2 (en) * 2011-11-04 2021-11-23 Population Bio, Inc. Methods for detecting a genetic variation in attractin-like 1 (ATRNL1) gene in subject with Parkinson's disease
KR101987477B1 (en) * 2012-05-07 2019-06-10 엘지전자 주식회사 Method for discovering a biomarker
WO2014144822A2 (en) * 2013-03-15 2014-09-18 Immumetrix, Inc. Methods and compositions for tagging and analyzing samples
CN104004840B (en) * 2014-05-26 2016-06-08 高新 Test kit for early screening and diagnosis of prostate cancer
CN115273970A (en) * 2016-02-12 2022-11-01 瑞泽恩制药公司 Method and system for detecting abnormal karyotype
EP3606960A1 (en) * 2017-04-03 2020-02-12 Oncologie, Inc. Methods for treating cancer using ps-targeting antibodies with immuno-oncology agents
CN107475388B (en) * 2017-08-22 2020-05-19 深圳市恩普电子技术有限公司 Application of nasopharyngeal carcinoma related miRNA as biomarker and nasopharyngeal carcinoma detection kit
CN108315404B (en) * 2018-01-25 2022-05-24 广州精科医学检验所有限公司 Method and system for determining fetal beta thalassemia gene haplotype
CN108531594A (en) * 2018-04-19 2018-09-14 安徽达健医学科技有限公司 A kind of polygene combined non-invasive detection methods and its kit for carcinoma of urinary bladder early screening
GB201905111D0 (en) * 2019-04-10 2019-05-22 Uea Enterprises Ltd Novel biomarkers and diagnostic profiles for prostate cancer
US20210215610A1 (en) * 2020-01-09 2021-07-15 Virginia Tech Intellectual Properties, Inc. Methods of disease detection and characterization using computational analysis of urine raman spectra
CN111154841B (en) * 2020-02-06 2024-02-20 江苏圣极基因科技有限公司 Method and kit for detecting absolute copy number of fetal free DNA in maternal plasma based on digital PCR
CN111308095A (en) * 2020-03-04 2020-06-19 北京师范大学 Urine protein marker for diagnosing prostate cancer
US20230223145A1 (en) * 2020-06-01 2023-07-13 20/20 GeneSystems Methods and software systems to optimize and personalize the frequency of cancer screening blood tests
CN113454219B (en) * 2020-08-10 2024-03-08 华大数极生物科技(深圳)有限公司 Methylation marker for liver cancer detection and diagnosis
US20220136062A1 (en) * 2020-10-30 2022-05-05 Seekin, Inc. Method for predicting cancer risk value based on multi-omics and multidimensional plasma features and artificial intelligence
CN112779334B (en) * 2021-02-01 2022-05-27 杭州医学院 Methylation marker combination for early screening of prostate cancer and screening method
CN113838533B (en) * 2021-08-17 2024-03-12 福建和瑞基因科技有限公司 Cancer detection model, construction method thereof and kit
CN114566285B (en) * 2022-04-26 2022-07-19 北京橡鑫生物科技有限公司 Early screening model for bladder cancer, construction method of early screening model, kit and use method of early screening model

Also Published As

Publication number Publication date
CN114743593A (en) 2022-07-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant