CN113637742A - High myopia gene detection kit, and high myopia genetic risk assessment system and method - Google Patents

Info

Publication number
CN113637742A
Authority
CN
China
Prior art keywords
high myopia
dna
seq
myopia
artificial synthesis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111147516.5A
Other languages
Chinese (zh)
Other versions
CN113637742B (en
Inventor
孙星汉
米豪
蒋传贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu 23 Magic Cube Biotechnology Co ltd
Original Assignee
Chengdu 23 Magic Cube Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes, for diseases caused by alterations of genetic material
    • C12Q1/6858: Nucleic acid amplification reactions; allele-specific amplification
    • G16B20/20: Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G16H50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for calculating health indices; for individual health risk assessment
    • C12Q2600/156: Oligonucleotides characterized by their use; polymorphic or mutational markers
    • C12Q2600/16: Oligonucleotides characterized by their use; primer sets for multiplex assays

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Zoology (AREA)
  • General Health & Medical Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Medical Informatics (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Public Health (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a high myopia gene detection kit, and a high myopia genetic risk assessment system and method. The assessment system includes: an acquisition module for acquiring a DNA sample from a test subject; a detection module comprising the high myopia gene detection kit, for genotyping the DNA sample; a prediction module comprising a high myopia genetic risk prediction model, for scoring the test subject's genetic risk of high myopia; and an evaluation module comprising a high myopia risk rating system, for determining the test subject's high myopia genetic risk grade. Through genome-wide association analysis of a large-scale high myopia cohort, the invention constructs and selects an optimal genetic risk prediction model using machine learning methods, provides a high myopia genetic risk assessment system and method designed specifically for the Chinese population, and enables general screening of high myopia genetic risk levels in large populations.

Description

High myopia gene detection kit, and high myopia genetic risk assessment system and method
Technical Field
The invention belongs to the field of high myopia genetic risk assessment, and particularly relates to a high myopia gene detection kit, a high myopia genetic risk assessment system and a high myopia genetic risk assessment method.
Background
In China, the prevention and control of high myopia is urgent. High myopia (HM) generally refers to a refractive error beyond -6.00 D or an axial length of the eye greater than or equal to 26 mm. Recent epidemiological surveys have shown that about 163 million people worldwide have high myopia (2.7% of the total population), and this figure is expected to grow to 938 million (9.8% of the total population) by 2050. In Asian populations, the prevalence among young people reaches 6.8%-21.6%, and the prevalence among middle-aged and elderly people is 0.8%-9.1%. China is a typical country with a high incidence of high myopia; in recent years, the prevalence of high myopia among Chinese adolescents has been between 6.69% and 38.4%, with a trend toward younger onset. According to the latest data released by the National Health Commission, the special nationwide myopia survey carried out from September to December 2020 shows that nearly 10% of myopic students have high myopia and that the proportion increases with grade: 1.5% of 6-year-old kindergarten children have high myopia, and the figure reaches 17.6% at the high school stage. The spread of high myopia must be curbed.
Myopia in itself is not to be feared; high myopia is. Once high myopia develops (over 600 degrees, i.e., beyond -6.00 D, with an axial length of more than 26 mm), the incidence of glaucoma is 14 times higher and the incidence of cataract increases by 3 times. High myopia therefore often causes permanent visual impairment or even blindness, and it is the second leading cause of blindness in China. Early detection and early intervention to reduce vision loss from high myopia have been important goals in the country's development of precision medicine in recent years.
High myopia has a marked hereditary tendency, so early warning of risk is highly valuable. For simple low-to-moderate myopia, genes and environment act together to drive progression. Family studies, twin studies, and population genetic studies of myopia-related genes have shown that genetic factors play a more prominent role in high myopia, especially pathologic myopia. Compared with common myopia, high myopia is characterized by early onset, rapid progression, and serious outcomes. However, in its early stages high myopia often presents as low-to-moderate myopia and is therefore easily overlooked, so that early intervention is missed. Early warning and identification of high myopia risk are therefore important.
Current approaches, such as correction with glasses and eye surgery, are applied after the fact and have limited therapeutic effect. Interventions to slow the progression of myopia in children have been successful in randomized clinical trials, and there is a need to determine which individuals (particularly children) are most likely to benefit from therapeutic intervention. According to the Expert Consensus on Attaching Importance to the Prevention and Control of High Myopia (2017), high-risk groups require focused prevention and control to avoid irreversible damage to vision. Genetic risk assessment for high myopia provides a risk early-warning method that may precede any known conventional indication, thereby helping individuals take active preventive measures early to slow or reduce the development of high myopia.
To date, over 100 genes and 20 chromosomal loci have been identified as associated with high myopia (Verhoeven et al., 2013a; Fan et al., 2012; Li and Zhang, 2017) by linkage analysis, candidate gene analysis, genome-wide association studies (GWAS), and next-generation sequencing (NGS), and numerous studies have revealed that high myopia is a complex polygenic disease. Although some inventions have been built on these publications, existing genetic risk prediction for high myopia is based largely on research in Caucasian populations; earlier studies in Chinese populations involved only small samples and had limited statistical power. Population differences therefore reduce risk prediction performance.
Genome-wide association studies (GWAS) can determine associations between single nucleotide polymorphisms (SNPs) and phenotypic traits, such as high myopia in the present invention. The GWAS approach is widely applied in the field of disease and health, and many common genetic variants associated with complex diseases have been identified. Most genetic variants contribute little to disease risk and are not sufficient on their own to predict disease status directly, but their cumulative effect is more discriminating for evaluating disease risk.
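As a minimal illustration of how a single SNP-trait association of this kind can be quantified, the sketch below computes an allelic odds ratio and a 2x2 chi-square statistic from allele counts in cases and controls. The function name `allelic_association` and all counts are hypothetical; this is a textbook allelic test, not the association analysis actually used in the invention.

```python
import math


def allelic_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Return (odds ratio, log odds ratio, chi-square) for a 2x2 allele table.

    Rows are cases/controls, columns are risk/other allele counts.
    All four counts must be positive for the odds ratio to be defined.
    """
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    if min(a, b, c, d) <= 0:
        raise ValueError("all allele counts must be positive")
    odds_ratio = (a * d) / (b * c)
    n = a + b + c + d
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, math.log(odds_ratio), chi2


# Hypothetical allele counts: 100 case alleles and 100 control alleles.
odds_ratio, log_or, chi2 = allelic_association(60, 40, 40, 60)
print(odds_ratio, round(log_or, 4), chi2)
```

The log odds ratio is the kind of per-allele effect size that is often used as a locus weight when individual variants are later combined into a score.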
A polygenic risk score (PRS) is a statistical method that assesses the genetic risk of a disease or trait from an individual's genotype profile, i.e., from the cumulative effect of multiple risk loci. In the classical PRS method, the PRS is calculated as the sum of an individual's risk alleles, each weighted by its effect size. Research has found that a PRS calculated from a large number of SNPs predicts better than one using only SNPs that reach GWAS significance. However, not all loci influence the trait of interest, and genotyping a large number of loci is costly and therefore unsuitable for large-scale risk screening, so finding the optimal predictive PRS for complex disease risk has become a focus of current research. Selecting more suitable candidate loci and PRS construction methods to integrate these genetic variants into a more predictive polygenic genetic risk score for high myopia helps develop early risk prediction for high myopia to guide primary prevention.
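The classical PRS described above can be sketched as a weighted sum over loci. In the following minimal example, the function name `polygenic_risk_score` is hypothetical, and the weights and genotypes are invented purely for illustration (only the three rs identifiers are taken from this document's locus list); it does not reproduce the prediction model of the invention.

```python
def polygenic_risk_score(dosages, weights):
    """Classical PRS: sum over loci of (risk-allele count) x (effect weight).

    dosages: dict mapping rs ID -> number of risk alleles carried (0, 1 or 2)
    weights: dict mapping rs ID -> per-allele effect weight (e.g. a log odds ratio)
    """
    score = 0.0
    for snp, weight in weights.items():
        if snp not in dosages:
            raise KeyError(f"missing genotype for {snp}")
        count = dosages[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"risk-allele count for {snp} must be 0, 1 or 2")
        score += weight * count
    return score


# Hypothetical weights and genotypes, for illustration only.
weights = {"rs1339000": 0.10, "rs10779363": 0.20, "rs783623": -0.05}
dosages = {"rs1339000": 2, "rs10779363": 1, "rs783623": 0}
score = polygenic_risk_score(dosages, weights)
print(round(score, 3))
```

Raising an error on a missing or invalid genotype, rather than silently skipping the locus, keeps scores comparable between individuals; how missing calls should actually be handled is a design decision this sketch leaves open.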
Previously published patent documents use gene detection to assess susceptibility to high myopia: 1) Chinese patent CN 107385036 A detects three sites, T239C, C298T and C893A, of the ARR3 gene; 2) Chinese patent CN 105861714 A detects the SNP loci rs189798 and rs10034228; 3) Chinese patent CN 108179187 A uses the MYP11 gene mutation site rs10034228, the GJD2 gene mutation site rs524952 and the RSPO1 gene mutation site rs4074961 to predict high myopia risk; 4) Chinese patent CN 102732607 A detects six pathogenic sites of the ZNF644 gene; 5) Chinese patent CN 109750103 A uses 24 gene loci.
Common SNP genotyping platforms include the TaqMan probe method, the SNaPshot method, time-of-flight mass spectrometry (MALDI-TOF MS) typing, and high-resolution melting (HRM) curve typing. Among them, SNP detection based on time-of-flight mass spectrometry (MALDI-TOF) can reach an accuracy of 99.9%; besides high accuracy, strong flexibility, high throughput, and a short detection cycle, its most attractive advantage is cost-effectiveness. The MALDI-TOF platform is an internationally used research platform for single nucleotide polymorphisms (SNPs), and the method has become a new standard in the field by virtue of its rigor and accuracy. The advantages and disadvantages of some SNP typing methods are given in Table 1 below.
TABLE 1
[Table provided as an image in the original (BDA0003285972150000021): advantages and disadvantages of SNP typing methods]
In summary, existing patents and detection methods still have the following disadvantages:
1) the genes or loci are limited to a few genes or SNP loci drawn from the available literature or public databases; such locus selection is limiting for complex traits with a polygenic genetic architecture (such as the high myopia addressed by the invention). For example, the six pathogenic sites of the ZNF644 gene can be detected in only a few high myopia patients;
2) after genotyping, a genetic risk assessment system is either absent or overly simple, amounting only to simple addition, combination, or accumulation; the locus weights are taken from the literature and public databases, so the resulting risk assessment system is biased and error-prone owing to differences in ethnicity, experimental design, calculation methods, and so on;
3) the performance of genetic risk prediction in an independent cohort, and the supporting data, are not explicitly disclosed, so discriminative power and scope of applicability in risk assessment require further validation;
4) the above defects can be largely overcome by genome-wide association studies (GWAS) and the construction of polygenic risk scores, but the few existing genetic risk score models for high myopia are simply constructed. Locus selection and model training mostly follow a single pruning-and-thresholding strategy, which suffers from the winner's curse and cannot adequately characterize linkage disequilibrium (LD), interactions, nonlinearity, and other relationships among loci;
5) almost all genetic scores and large-scale genome-wide association studies related to high myopia are based on European populations, yet the epidemiological characteristics of high myopia differ between ethnic groups (compared with Western populations, East Asian populations, particularly the Chinese population, have much higher proportions of myopia and high myopia). Because of differences in ethnicity and genetic architecture, existing polygenic genetic risk scores built on European populations are not suitable for the Chinese population. It is therefore important to construct a high myopia PRS from a large-scale Chinese high myopia cohort and to rigorously evaluate its genetic risk prediction value in an independent cohort;
6) the current mainstream detection technologies are high-throughput whole-genome resequencing (i.e., sequencing all of the roughly 3 billion bases of the human genome and comparing them with a reference genome to identify variant sites) and whole-exome sequencing (determining the sequences of all protein-coding regions of the human genome, i.e., the exons). Although both are highly accurate, they are costly and slow: whole-genome resequencing is expensive per sample, the subsequent bioinformatic analysis takes a long time, and sequencing every locus is unnecessary for high myopia genetic risk assessment. Genotyping based on fluorescence quantitative PCR is timely but low in throughput: each reaction detects only one site in one sample, so 86 sites require 86 reactions, which raises the detection cost, and with large sample sizes the detection time is greatly prolonged. It is therefore necessary to establish an efficient, high-throughput, low-cost high myopia risk assessment and detection kit to enable rapid, large-scale population screening for high myopia risk.
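The single-strategy locus selection criticized in item 4) is commonly implemented as pruning and thresholding; assuming that reading, the following is a minimal sketch of a greedy, p-value-ordered variant (often called clumping). The function name `prune_and_threshold`, the cut-offs `p_max` and `r2_max`, and all input values are hypothetical and chosen only for illustration.

```python
def prune_and_threshold(p_values, ld_r2, p_max=5e-8, r2_max=0.1):
    """Greedy selection: keep SNPs with p <= p_max, most significant first,
    skipping any SNP whose LD (r^2) with an already kept SNP exceeds r2_max.

    p_values: list of association p-values, one per SNP
    ld_r2:    square matrix (list of lists) of pairwise r^2 values
    Returns the indices of the selected SNPs in order of significance.
    """
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    kept = []
    for i in order:
        if p_values[i] > p_max:
            break  # all remaining SNPs are less significant
        if all(ld_r2[i][j] <= r2_max for j in kept):
            kept.append(i)
    return kept


# Hypothetical data: SNP 0 and SNP 1 are in strong LD; SNP 2 is not significant.
p_values = [1e-10, 1e-9, 0.2, 1e-8]
ld_r2 = [
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
selected = prune_and_threshold(p_values, ld_r2)
print(selected)
```

Because each SNP is judged only by its own p-value and its pairwise LD with kept SNPs, a procedure of this kind cannot represent interactions or nonlinear effects among loci, which is the limitation the text points out.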
Disclosure of Invention
The invention aims to provide a high myopia gene detection kit and a high myopia genetic risk assessment system and method, so as to solve the problem that prior-art high myopia gene detection methods cannot screen large populations for high myopia risk accurately, quickly, and cheaply, owing to the bias and errors caused by limited locus selection, ethnic differences, differences in experimental design, calculation methods, and similar issues.
In order to solve the problems, the invention adopts the following technical scheme:
according to the first aspect of the invention, the high myopia gene detection kit is provided, and the kit can be used for carrying out SNP typing on 86 gene loci related to high myopia susceptibility risks at the same time. The 86 loci are shown below: rs1339000, rs10779363, rs783623, rs882879, rs2352096, rs3108476, rs61548163, rs79468746, rs2741293, rs1728288, rs1452186, rs322700, rs117449253, rs147806089, rs61659428, rs73110528, rs11723242, rs13138132, rs36187983, rs139039638, rs2583612, rs9312984, rs256310, rs 8030, rs35590232, rs10458140, rs 858, rs3756772, rs6930157, rs 2969301347, rs 121281, rs 2706948415, rs182270494, rs77003675, rs 7387240, rs 2323124, rs 99242744, rs13230744, rs 90054, rs 787894, rs 656579797979797779777946, rs 4277797779777946, rs 4277567946, rs 42775679775639, rs 42777977569862, rs 427356989, rs 72797756982, rs 72982, rs 42775635989, rs 4277797756989, rs 427356982, rs 427356300, rs 427356989, rs 427356300, rs 7256300, rs 7279437356300, rs 427356300, rs 7256300, rs 7279437356300, rs 72798, rs 7279437356300, rs 7256300, rs 7279437356300, rs 7256300, rs 7279437356300, rs 72798, rs 7279437356300, rs 727945, rs 72798, rs 7279437356300, rs 727945, rs 7279437356300, rs 72798, rs 7279437356300, rs 427356300, rs 7279437356300, rs 727945, rs 7279437356300, rs 727945, rs 7279437356300, rs 437356300, rs 427356300, rs 7279437356300, rs 7279.
The kit also comprises 86 PCR amplification primer pairs for amplifying the 86 gene loci, with the sequences shown below. The amplification primer sequences are written from the 5' end to the 3' end, the SNP positions refer to hg19, and the rs numbers are from dbSNP build 151.
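Since the primers are stated to be written from the 5' end to the 3' end, basic properties of a primer pair can be checked with a few lines of code. The sketch below is only an illustration: the helper names `gc_fraction` and `reverse_complement` are hypothetical, and the two sequences are SEQ ID NO.1 and SEQ ID NO.2 as listed in this document.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in a primer written 5'->3'."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a sequence, also written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


forward = "CAAGAAATTACAGACATGATT"  # SEQ ID NO.1 (upstream primer)
reverse = "GGACGTCAGGATGGATAAGTG"  # SEQ ID NO.2 (downstream primer)

for name, primer in (("forward", forward), ("reverse", reverse)):
    print(name, len(primer), round(gc_fraction(primer), 3), reverse_complement(primer))
```

GC fraction and length are only rough indicators of primer behaviour; multiplex compatibility of the full primer set would require further checks that this sketch does not attempt.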
The 86 gene loci and the 86 PCR amplification primer pairs are as follows:
1. Amplification primer pair 1 (chr1:42366658): A>G (wild type is A, mutant is G; the same notation applies below)
SEQ ID NO.1 (upstream primer, same as below): CAAGAAATTACAGACATGATT
SEQ ID NO.2 (downstream primer, same below): GGACGTCAGGATGGATAAGTG
2. Amplification primer pair 2 (chr1:54154726): G>T
SEQ ID NO.3:GTCACAGTAGTCTTAGAACTC
SEQ ID NO.4:CCCTCACAGGGTGGGGCAGGT
3. Amplification primer pair 3 (chr1:164247961): A>C
SEQ ID NO.5:AAGATGTAATTTATATATCTC
SEQ ID NO.6:AAATCAATGCATAAATATCAG
4. Amplification primer pair 4 (chr1:219786890): C>T
SEQ ID NO.7:AATACAAACAGATGCAGAAAG
SEQ ID NO.8:TATACACACACAGAATCAAAA
5. Amplification primer pair 5 (chr2:157376170): T>C
SEQ ID NO.9:TTGATACACATAAAAGAATAC
SEQ ID NO.10:GTGGGACTTGCGACAGGGGTG
6. Amplification primer pair 6 (chr2:178781124): C>A
SEQ ID NO.11:AGCTGAGAAGTCTTCCTCCTT
SEQ ID NO.12:CAGACCCACGTTTCAGTCTCT
7. Amplification primer pair 7 (chr2:184349492): G>A
SEQ ID NO.13:TTGTTTGAGTTTCTTGTAGAT
SEQ ID NO.14:TAGATATTGGAAAGTTTTCAG
8. Amplification primer pair 8 (chr2:233376715): A>G
SEQ ID NO.15:ATGCTAAGGACCCCTGGGACT
SEQ ID NO.16:AGCTGGCTAAAGGGGTGTAGG
9. Amplification primer pair 9 (chr2:233384133): T>C
SEQ ID NO.17:AAATTCACTCACCAGAATCAT
SEQ ID NO.18:CACTGTGGAACAGCTGGCTCT
10. Amplification primer pair 10 (chr2:236665271): C>T
SEQ ID NO.19:GCTGATAAAGATGTACGTGAG
SEQ ID NO.20:GCTTGCAAGTGAGCATTACTC
11. Amplification primer pair 11 (chr3:25354688): A>G
SEQ ID NO.21:TGTTTGTTGTTTTTGTTTTTG
SEQ ID NO.22:TCAAATGTAGATAAACCAGAC
12. Amplification primer pair 12 (chr3:29312653): A>G
SEQ ID NO.23:TTCTAGATAGTACTTTATTGT
SEQ ID NO.24:GTATTGCTCTATATCAGAAAC
13. Amplification primer pair 13 (chr3:68668739): G>A
SEQ ID NO.25:CAGGGCAAGATTCTGTGGGAC
SEQ ID NO.26:TTTCTTAAAGGTTATATTTTG
14. Amplification primer pair 14 (chr3:68945717): G>A
SEQ ID NO.27:AGTAAGTGAGCAAAATGTTAT
SEQ ID NO.28:GAGAAGAAAAAGTGGCAAAAC
15. Amplification primer pair 15 (chr3:115352358): A>G
SEQ ID NO.29:CCTATATGTCCACATAGTCCA
SEQ ID NO.30:TTATTGGTAGGGTTATTATCA
16. Amplification primer pair 16 (chr3:87217907): C>T
SEQ ID NO.31:CAATTACTCTCACCCATCATC
SEQ ID NO.32:TCTCACACGGTCTTCCATGCA
17. Amplification primer pair 17 (chr4:8750251): A>G
SEQ ID NO.33:CAAGCCCACACCTGCTCAGAG
SEQ ID NO.34:GGACAGGGTAGACTCCTGCGG
18. Amplification primer pair 18 (chr4:44930082): G>A
SEQ ID NO.35:CTAATACATTTTTAGCCACGT
SEQ ID NO.36:TTTTGAAGGGAAGTGCAGACA
19. Amplification primer pair 19 (chr4:81950520): A>C
SEQ ID NO.37:AGGGATAAATGCAATAGACAT
SEQ ID NO.38:GTGTGGTTATCTGGCCTGAGG
20. Amplification primer pair 20 (chr4:118917282): C>T
SEQ ID NO.39:CTCTCTGTTTCATCCACACTG
SEQ ID NO.40:AACACAAATATTTTTCCTCAG
21. Amplification primer pair 21 (chr4:120612718): C>T
SEQ ID NO.41:AATTTACATAAGTAGGAAAGT
SEQ ID NO.42:ACCCTGATGGAGGCAGGGGCT
22. Amplification primer pair 22 (chr5:233188): C>T
SEQ ID NO.43:AAGCAGCTAGAACTGCCCATA
SEQ ID NO.44:TAGTCACCACAGTATCATAGT
23. Amplification primer pair 23 (chr5:34615822): A>C
SEQ ID NO.45:TGACTTGGCGGGATCACTGTC
SEQ ID NO.46:GAGAATTTCAAGATGATGAGA
24. Amplification primer pair 24 (chr5:80008704): C>T
SEQ ID NO.47:TTTCAAATAATGCCCTTAGTC
SEQ ID NO.48:TCTGTCCTACTCAGTCAAGAA
25. Amplification primer pair 25 (chr6:2367903): G>T
SEQ ID NO.49:CAGTGTTGGTAGCTGACAGTC
SEQ ID NO.50:CTGTTCGTAATGGTATGATGC
26. Amplification primer pair 26 (chr6:2439304): T>C
SEQ ID NO.51:CCCCAGAACTTAATAAAACCT
SEQ ID NO.52:GAGCCGATAAGGTAAAATGAA
27. Amplification primer pair 27 (chr6:43806609): G>A
SEQ ID NO.53:CCACAGAAGTCAGAGTGCTGT
SEQ ID NO.54:TCAGGGAGGCAAGGGGCTTTG
28. Amplification primer pair 28 (chr6:73592597): T>C
SEQ ID NO.55:TTCTGCTGGGTTGGTGCTGAT
SEQ ID NO.56:TTTCATGTGGGAAAAATAGAT
29. Amplification primer pair 29 (chr6:73629566): A>G
SEQ ID NO.57:AGGAAGCCTGCCTTTGGTTAT
SEQ ID NO.58:GTCTGTTGCTTGGAGCCAAGA
30. Amplification primer pair 30 (chr6:84325184): C>T
SEQ ID NO.59:AGGTCCTTCTGTCACAGGAAC
SEQ ID NO.60:AACTAGGGCATCTCTTATTGT
31. Amplification primer pair 31 (chr6:116325142): C>T
SEQ ID NO.61:CTTTCTCTGATTAGAAAGGAA
SEQ ID NO.62:ACCAATTAACATTCATTTTTT
32. Amplification primer pair 32 (chr7:85830272): G>A
SEQ ID NO.63:GTACTTCCTCTCCCACTAATT
SEQ ID NO.64:ATATATATGTTGGCTCCAAAA
33. Amplification primer pair 33 (chr7:87397345): C>T
SEQ ID NO.65:AGGGAGAAAACTTTTCACCCA
SEQ ID NO.66:CCAGGTGATAGGCACACTTTT
34. Amplification primer pair 34 (chr7:99797792): A>G
SEQ ID NO.67:AAAAAAGAAATCTCGTGGGAG
SEQ ID NO.68:AGTGACTCTCATTCCAGGGCT
35. Amplification primer pair 35 (chr7:158763881): G>A
SEQ ID NO.69:ACTCTGCCACTCTTCCCATCA
SEQ ID NO.70:AAGGCCAGCACATTGAGGCTG
36. Amplification primer pair 36 (chr7:158864215): T>C
SEQ ID NO.71:GGCACTGGCTTTTCCATGGGC
SEQ ID NO.72:ACAGGAGTTCAATTCAGCGGT
37. Amplification primer pair 37 (chr7:158934102): A>G
SEQ ID NO.73:GGAAGAAGCCTCGGAGGCAGA
SEQ ID NO.74:GAGACCGCTGGCCGCCTGTGG
38. Amplification primer pair 38 (chr7:158992603): T>C
SEQ ID NO.75:TGCTTTTGTATATCTCTTACA
SEQ ID NO.76:TAGGAAGCATTCAGGAGGCCC
39. Amplification primer pair 39 (chr8:40723038): C>A
SEQ ID NO.77:GGATCTTCCAGGCAGGATGTG
SEQ ID NO.78:GCCATGCTGCACAGCCAGCTG
40. Amplification primer pair 40 (chr8:60137077): A>G
SEQ ID NO.79:TTAAAAAAGCATACCATTAAT
SEQ ID NO.80:TTTCTATTTTTCTATTACCAT
41. Amplification primer pair 41 (chr8:121600998): A>G
SEQ ID NO.81:CTGTAAAATGGGGATATTACT
SEQ ID NO.82:CTGCCAGTTACAAATTCAGAA
42. Amplification primer pair 42 (chr9:3928115): C>T
SEQ ID NO.83:AGGTAAGTAATGCATCATCAA
SEQ ID NO.84:AATATAGTTTCTAAAATGTGA
43. Amplification primer pair 43 (chr9:77417406): T>C
SEQ ID NO.85:AAAGCAGTGCCTCTTTAGGAT
SEQ ID NO.86:GAGCAGTACGATGAGTGCTGA
44. Amplification primer pair 44 (chr9:101908915): A>G
SEQ ID NO.87:AATGGGCTTAGTATTCTGGGA
SEQ ID NO.88:TGAATGAAATTATGTACAGTC
45. Amplification primer pair 45 (chr9:130683501): T>C
SEQ ID NO.89:GTCGGACCACGGACCCTTGCC
SEQ ID NO.90:GCCCCCAGGACCCCTGCTTCT
46. Amplification primer pair 46 (chr10:16973346): T>C
SEQ ID NO.91:CATCAAATTTAGTCATAAAAA
SEQ ID NO.92:GAAGGAACCTTATTGAGCATC
47. Amplification primer pair 47 (chr10:60335179): T>C
SEQ ID NO.93:GTTTCAGTGTTCTACCACCCA
SEQ ID NO.94:TTATGAAAGTTCCACTAGTAC
48. Amplification primer pair 48 (chr10:102631263): A>G
SEQ ID NO.95:CAGCTCATCCACCCTCCCTCC
SEQ ID NO.96:AGCACACCTCTCTTCAAGATG
49. Amplification primer pair 49 (chr10:104591152): A>T
SEQ ID NO.97:TGGGAAACCCAGCTGTGAAGA
SEQ ID NO.98:GGGGTGGTGGAGCAGAGTCCA
50. Amplification primer pair 50 (chr11:16440714): A>G
SEQ ID NO.99:ATCAATCAATGCAACTTGGCA
SEQ ID NO.100:TATGATTATCAAATTTATGGT
51. Amplification primer pair 51 (chr11:16789646): C>G
SEQ ID NO.101:GGAGGTGTGAGCTAGGACTGC
SEQ ID NO.102:TCCTTGCTGGGTAGACCTCCT
52. Amplification primer pair 52 (chr11:30028361): C>A
SEQ ID NO.103:TCAACACATGCTTAATGAAAA
SEQ ID NO.104:CCCCAGCACAGGTCAAAATAG
53. Amplification primer pair 53 (chr11:40155064): G>T
SEQ ID NO.105:CATATTTATAACCCCAGGCGA
SEQ ID NO.106:CTCTAACGCTATACCTACCAG
54. Amplification primer pair 54 (chr11:126257083): C>T
SEQ ID NO.107:TCTAGGGTCTGAGATCAGCTG
SEQ ID NO.108:CCTGACACCTGGAAAGGTCGT
55. Amplification primer pair 55 (chr11:128759271): A>C
SEQ ID NO.109:CATACACATTTTCTGTTGCTT
SEQ ID NO.110:TTGTCTGTGCTCTTTGAGAGG
56. Amplification primer pair 56 (chr12:12879254): G>T
SEQ ID NO.111:TCATTTTTATTGAATTAAATT
SEQ ID NO.112:CTGGGTACGGAAATTCTTTTG
57. Amplification primer pair 57 (chr12:56114769): C>G
SEQ ID NO.113:TGCTCCATACAGCAGGTCTGT
SEQ ID NO.114:TCTGCCAACCAGATAATTTCT
58. Amplification primer pair 58 (chr12:71246815): T>C
SEQ ID NO.115:TTACCGTGGCAAATTTCTATC
SEQ ID NO.116:GGACTTTTATTTTATGGAGGA
59. Amplification primer pair 59 (chr12:123909289): T>C
SEQ ID NO.117:AGCTAGTTTAAATCGGACTTC
SEQ ID NO.118:ACTGGTAAATTTACCCCCATG
60. Amplification primer pair 60 (chr12:125811056): A>G
SEQ ID NO.119:CAATGAAGACAATATGAGCTC
SEQ ID NO.120:CTGGGGGGTGTTGAATAAAGC
61. Amplification primer pair 61 (chr13:22305099): T>C
SEQ ID NO.121:GCAAGGCCAAGGGAGAGCAGC
SEQ ID NO.122:CCTATTCACAACCTGCCCTTG
62. Amplification primer pair 62 (chr13:50141345): G>A
SEQ ID NO.123:TCAAGAGACTCTTACCTCATC
SEQ ID NO.124:CCACTTTCCGACATCCACCAT
63. Amplification primer pair 63 (chr13:99576446): G>A
SEQ ID NO.125:GGGTTAAGCCAATCCAATGGG
SEQ ID NO.126:GTTTAAAAAGAGGCAATCGGG
64. Amplification primer pair 64 (chr13:105628137): G>A
SEQ ID NO.127:TGGCAAGTGTAGGAGATGGAA
SEQ ID NO.128:CATATTTTAAATAAAATGATA
65. Amplification primer pair 65 (chr14:54316424): C>T
SEQ ID NO.129:AGGGGAGCATCTCCTGGTGGG
SEQ ID NO.130:TGAAATCTGGTGAAATAGATC
66. Amplification primer pair 66 (chr14:54419106): C>A
SEQ ID NO.131:GGGCTAGAAATGGAGGGGCAA
SEQ ID NO.132:CCCTGTAATTACTTGGTCTAA
67. Amplification primer pair 67 (chr14:75041449): A>G
SEQ ID NO.133:GAGATTTGTTTAGGGCTGGAC
SEQ ID NO.134:GGTTGTCATCAGAGAGTTTGA
68. Amplification primer pair sixty-eight (15:35008335 site): G→A
SEQ ID NO.135:TACTCTCTCCCTGGATTGACT
SEQ ID NO.136:AAACTGACACGTTTCCTCGCA
69. Amplification primer pair sixty-nine (15:35025496 site): G→A
SEQ ID NO.137:CATGGCAATTGATGGTAGAAA
SEQ ID NO.138:GAGCAGTTTCTAAATTGGCTG
70. Amplification primer pair seventy (15:63571121 site): C→T
SEQ ID NO.139:TAACCAAGTTTCTAGCTAGAT
SEQ ID NO.140:GTCTCACCAAGGAAAATGCAC
71. Amplification primer pair seventy-one (15:68449779 site): A→C
SEQ ID NO.141:CTCTATGGATAGAGCTCTGTG
SEQ ID NO.142:TTCCACTGAAATTGCTCTTTC
72. Amplification primer pair seventy-two (15:79372875 site): A→G
SEQ ID NO.143:AGTTTGGCTAAGAGCCCAAGA
SEQ ID NO.144:ACCAGCTAAGGTTACCAAAAA
73. Amplification primer pair seventy-three (15:80248207 site): A→G
SEQ ID NO.145:CCCAGCCTCTCAACACTGTTG
SEQ ID NO.146:GAAATGACATAATCAAAACCA
74. Amplification primer pair seventy-four (15:87161911 site): C→T
SEQ ID NO.147:ATGTCACATGTATAGTGCTAA
SEQ ID NO.148:AAAATGGGAAAGTGGATCATA
75. Amplification primer pair seventy-five (16:73127556 site): A→G
SEQ ID NO.149:ACCGGTCGTCTCCTGCAAACA
SEQ ID NO.150:CCCTCCTGGCGGCGTTCCCAG
76. Amplification primer pair seventy-six (16:84575278 site): C→T
SEQ ID NO.151:AGGCATGAGCCACTGTGTCCA
SEQ ID NO.152:AATAAGTCTAAAATAGAAAGA
77. Amplification primer pair seventy-seven (16:86629047 site): A→G
SEQ ID NO.153:TTTTTGGTGCCACCCTTTAGA
SEQ ID NO.154:TGGCCCCAACCTTGGTGACAC
78. Amplification primer pair seventy-eight (16:87297624 site): G→A
SEQ ID NO.155:AACTAAGGATGAGGAACACAG
SEQ ID NO.156:CACACTGGATTTCAAAGGCTT
79. Amplification primer pair seventy-nine (17:11407901 site): G→A
SEQ ID NO.157:CAGCATTCCTAGCCTTGGCCA
SEQ ID NO.158:GTTTATTTACAACTTAATCCA
80. Amplification primer pair eighty (17:54735307 site): A→G
SEQ ID NO.159:ATGTTCCAATGACAAATAATC
SEQ ID NO.160:GTTGTTGTTTTAAGAAATAAC
81. Amplification primer pair eighty-one (17:56584508 site): C→T
SEQ ID NO.161:AGGAGGGGGCCTGCTCCCTGG
SEQ ID NO.162:GGGTCCAGGGCACAGGCTTTA
82. Amplification primer pair eighty-two (19:5092450 site): T→G
SEQ ID NO.163:GCCACCCATTGAAGACACCTG
SEQ ID NO.164:TCCGGGGGGGACACATCTCCC
83. Amplification primer pair eighty-three (19:7231000 site): A→G
SEQ ID NO.165:TGTGCAGAAGAGAGACTATCT
SEQ ID NO.166:GGGAAGCCAACTCCCTTGGCC
84. Amplification primer pair eighty-four (21:16495307 site): G→A
SEQ ID NO.167:TCTCATCTATAATATAAAATA
SEQ ID NO.168:AGCTATTATATATCTTAAATC
85. Amplification primer pair eighty-five (21:16523219 site): G→A
SEQ ID NO.169:GTGTTGCATTTATTTGAGTTA
SEQ ID NO.170:ATTTTGGCATATTAAATCATC
86. Amplification primer pair eighty-six (21:16568524 site): T→C
SEQ ID NO.171:ACAGGGGCCATGCAATATGAT
SEQ ID NO.172:TCTTTGGAACTTGGTTTTCTA.
The high myopia gene detection kit provided by the invention further comprises 86 extension primers that identify the 86 gene mutation sites in the same order. All sequences are written from the 5' end to the 3' end, and the extension primer sequences are as follows:
SEQ ID NO.173:GGGTCAAGGCAGGATGTGGGGGACCG
SEQ ID NO.174:CCCCTGCTTTCAATAGCACTTTGTGG
SEQ ID NO.175:TTAGAAAATCCTAAAGACTACCAAAA
SEQ ID NO.176:AAGAACCTTAAAGTAGTTATAGCTGC
SEQ ID NO.177:AGTTAGTCATTAAGATTGTGGGATCT
SEQ ID NO.178:CGAATTCCCTCTGCCTCTGATTTCAC
SEQ ID NO.179:TGCATAGTTTGCAAACATTGAGAAGG
SEQ ID NO.180:ACCTCAAAGTCCCATAGGCCTGGGAG
SEQ ID NO.181:TCCTGAAGTAGAGGTATTGTCATGGC
SEQ ID NO.182:TGGATTGTTTATTTGCATCCCAGCAT
SEQ ID NO.183:TGGAAGTTCCCCTCTTATGGGAATAG
SEQ ID NO.184:ATTTAAATTTTCCCGCATGCCAAAAG
SEQ ID NO.185:GTTGGGGTTTTATTTTCTGAGTGACG
SEQ ID NO.186:GTTTTTTAACCTCTCTCTAGAGCTGA
SEQ ID NO.187:GAAGGAACTTATCTCAAGTGTGGTGG
SEQ ID NO.188:GGTATAGATCATCAGCTTAACTCAGT
SEQ ID NO.189:AATTTGCGATGGCACGTTGGTAAGAA
SEQ ID NO.190:CCTTAAATGTTAGTATCTAGACTCTG
SEQ ID NO.191:GTCAAGAACTGCACTAAACATTTACA
SEQ ID NO.192:CCATTGGGCCTTCTGCCTTAAACAGT
SEQ ID NO.193:GGGTAGAGATGGAAAGGGATATGAAC
SEQ ID NO.194:AAAATAATCATTATTACTGTCCTTGT
SEQ ID NO.195:CAAGTGGGACCAGCCACATAATTTTA
SEQ ID NO.196:CTACTGTGAGCACTGATAAAATTAAT
SEQ ID NO.197:CTGCCTTTAGTGGAAGAGAATGGCTT
SEQ ID NO.198:TAAAGTGGATAACCAATAAAATAGAC
SEQ ID NO.199:TGTGGCCAGGCTGCTGTTATGCAATG
SEQ ID NO.200:CATGGAAGTTTGACAAGAGTGTACGC
SEQ ID NO.201:GACGTTCCATGCCCAAGATGGATGTA
SEQ ID NO.202:TCCATTTGCTTAAATCCTAGGCAACT
SEQ ID NO.203:AGTTGTTTCTCTGCATCTGATCTTCC
SEQ ID NO.204:TCATCAAACCACTTGATGTTCAACTG
SEQ ID NO.205:CGATCTGTACTCTATTGCAGGAATGT
SEQ ID NO.206:GGCAGATAAGGTCATGGGGAGGAGGA
SEQ ID NO.207:CTGATGTGTCCATGCAATTTGCCTTA
SEQ ID NO.208:CTCAGTTGTGCCTGGAGCTCGGACCC
SEQ ID NO.209:CGGTGGCTCAGAGGGGGATGGGAACG
SEQ ID NO.210:TTTGTTGTATTTTATTAGTATTAGCC
SEQ ID NO.211:GAGTGAGCCCAGGCCTCTCTGCTCCA
SEQ ID NO.212:ATTATTTATCTTAAATATATTGTGGG
SEQ ID NO.213:TGGAATTAGGTAATGTATTTCAAGGG
SEQ ID NO.214:TGACTCGCTTTTAAAAAATCAGTGCC
SEQ ID NO.215:TGAAAGATGGGTGGGTAGGGGGAGAC
SEQ ID NO.216:TGGTAAATTGCTCTCCTCTCCCCCAG
SEQ ID NO.217:ACTTTCAGAGGAGGGCAGAGCTCACC
SEQ ID NO.218:ACAATACTCACCAACTGACACAGACC
SEQ ID NO.219:AGCCCTGAGAAGTCATTACTATCTCC
SEQ ID NO.220:GAACTAGGCTCCAAGTTCTCACGACA
SEQ ID NO.221:GAGGGTGTCAACAGGTCCGTATAGTT
SEQ ID NO.222:TTTTTCATGCCATTTGCACATTACTA
SEQ ID NO.223:GATAACTCACTTGCAAGTTCGCTCAG
SEQ ID NO.224:CTGTTTAAAACAAACTTGACAAAGCA
SEQ ID NO.225:TTTTATAGTAGAGTGCGGGGAAAGAT
SEQ ID NO.226:GAGCTACATTGTCTGCTCATACAACT
SEQ ID NO.227:AGACAATGAAAGACATACTTAGCACC
SEQ ID NO.228:AGAGCGTTTTTGCGGGAGGAATATGT
SEQ ID NO.229:TTAGTTAAGTTAGCCACAAATACAGG
SEQ ID NO.230:CTACCGTTCACGGGCCAGGGGCCGCC
SEQ ID NO.231:TGACCAAAATGAAACATTTCAATTAC
SEQ ID NO.232:ATTCCGTTGTCCTGGCAACCTGTATA
SEQ ID NO.233:GCTCTAAGAAGGGATGTAGACATGGC
SEQ ID NO.234:ACTGGCTGAGGTGCCGAAGACACACG
SEQ ID NO.235:GGATTGGGTAAATATAGAACTAAAGG
SEQ ID NO.236:GTTTGATTGAGGATTCTTTGCTTGAG
SEQ ID NO.237:GTTCCAAGACAGGTGGCTAGGCATTC
SEQ ID NO.238:AAGTTTGTGTCTTCTCCCTCACACCA
SEQ ID NO.239:TCTCCTGAAACCAAACCCAGGATGCG
SEQ ID NO.240:GAAAAGTTAAATGTGCATAATAATCA
SEQ ID NO.241:TCACATCAGGCTGAGTTACAAGACTG
SEQ ID NO.242:GATGTACTTGCAAGTAGCAGGGTCAT
SEQ ID NO.243:TTTTTGTGAGAGTAGTCTACACTTGC
SEQ ID NO.244:TTTCACTCATATAGTCTCACAGGAAA
SEQ ID NO.245:AGTTTTGGAGCAAACCATAGCACATG
SEQ ID NO.246:AGAACTGCTAAACCAGAGAGTGTGGT
SEQ ID NO.247:ACTCGGCCTCTCCCCACAGAATACAG
SEQ ID NO.248:CCTTATGCCTTGAATAAAGCCTTATT
SEQ ID NO.249:GCAGATGAAAAAAACTGAGGTTGAGG
SEQ ID NO.250:AGAAAATGGTTCGTACTGTGGCTTCA
SEQ ID NO.251:AAGTATCTATCAAATCACATGCATCG
SEQ ID NO.252:TTGGGGACTACCTTCAGAGTCTTCAA
SEQ ID NO.253:CCAAAGTCAGCATCACACGCCTCGCT
SEQ ID NO.254:TTCCTTCCTTTCTCCCACTGCTCTCG
SEQ ID NO.255:TGCCAAGGAGGCGCGTGTCTACAGCG
SEQ ID NO.256:TGTTTTAAGGATTAAATGATAGCACG
SEQ ID NO.257:CAATGGTAGTTTTATGTTTCAATTCA
SEQ ID NO.258:AAAAAGGACTGGACATTTGTGCTGGC
According to the present invention, preferably, the PCR amplification reaction system of the kit is as follows:
Figure BDA0003285972150000121
Figure BDA0003285972150000131
According to the present invention, preferably, the kit further comprises an SAP reaction system, as follows:
SAP buffer 0.17 μL
SAP enzyme 0.5 U
ddH2O to a final volume of 2 μL.
The kit uses a time-of-flight mass spectrometer for detection. First, the target sequences are amplified simultaneously in a single reaction by multiplex PCR; SNP-specific extension primers are then added, and each primer is extended by one base at its SNP site. The prepared analyte is co-crystallized with the chip matrix and excited by a strong laser pulse in the vacuum tube of the mass spectrometer, which desorbs the nucleic acid molecules as singly charged ions. In the electric field, the flight time of an ion increases with its mass (it is proportional to the square root of the mass-to-charge ratio), so measuring the flight time of the nucleic acid ions in the vacuum tube gives the accurate molecular weight of the analyte, from which the genotype at the SNP site is determined.
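As a rough illustration of how ion mass relates to flight time, the following Python sketch uses the idealized linear time-of-flight relation t = L * sqrt(m / (2 * z * e * U)). The drift length, the accelerating voltage and the two product masses are assumed values chosen only for illustration; they are not taken from the kit or from any instrument specification.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mass_da, charge=1, voltage=20_000.0, drift_length=1.0):
    """Idealized linear TOF flight time in seconds.

    voltage (V) and drift_length (m) are hypothetical instrument settings.
    """
    mass_kg = mass_da * DALTON
    kinetic_term = 2 * charge * ELEMENTARY_CHARGE * voltage
    return drift_length * math.sqrt(mass_kg / kinetic_term)


# Two hypothetical single-base extension products for the two alleles of one SNP.
allele_1_mass = 8000.0  # Da, assumed
allele_2_mass = 8016.0  # Da, assumed

t1 = flight_time(allele_1_mass)
t2 = flight_time(allele_2_mass)
print(f"allele 1: {t1 * 1e6:.3f} us, allele 2: {t2 * 1e6:.3f} us")
```

The heavier extension product arrives later, which is what lets the two alleles of a site be told apart from a single spectrum.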
The gene loci covered by the kit were derived from a large-scale cohort study of high myopia in the Chinese Han population and do not rely on any existing literature or research; from the collected data, the group of SNPs with the highest predictive power for high myopia was selected as the molecular markers. Specifically, the loci are selected through the construction of the high myopia genetic risk prediction model: the SNPs in the feature set of the selected optimal model serve as the targets detected by the kit. All SNP loci in the kit are common variant loci with a minor allele frequency of at least one thousandth (with reference to East Asian populations), which ensures that the kit is applicable to a wide population. The correspondingly designed, highly specific primer sets allow accurate genotyping of the risk genes related to high myopia. Detection with a time-of-flight mass spectrometer ensures the accuracy and sensitivity of the results and is simple and practical.
According to a second aspect of the present invention, there is provided a high myopia genetic risk assessment system comprising: the acquisition module is used for acquiring a DNA sample of a tester; a detection module comprising a high myopia gene detection kit as described above for genotyping the DNA sample; the prediction module comprises a high myopia genetic risk prediction model used for scoring the high myopia genetic risk of the tester; and an evaluation module comprising a high myopia risk rating system for determining a high myopia genetic risk rating for the test subject.
According to a preferred embodiment of the present invention, the high myopia genetic risk prediction model uses genome-wide association analysis and machine learning algorithms to integrate risk locus information, and predicts and quantitatively scores the genetic risk of high myopia.
The main use case of the high myopia genetic risk prediction model is to score the genetic risk of high myopia from the genotyping data produced by the high myopia gene detection kit, combined with gender information. The model can also be used in other settings in which the loci corresponding to the model features are available, for example when the whole-genome data and the gender of the test subject are known.
The performance of the high myopia genetic risk prediction model was evaluated in an independent population of sufficient sample size. The AUC is 0.62-0.63, which is an excellent level in the current field of polygenic prediction. The case/control odds ratio (OR) between the highest percentile and the middle percentile is 3. With a classification threshold of 0.5, the sensitivity is 0.89, the positive predictive value (PPV) is 0.69 and the overall accuracy is 0.67. Adjusting the classification threshold can further improve the sensitivity and positive predictive value for the high myopia group. The model also has some ability to predict and discriminate people with moderate myopia (more than 300 degrees and less than 600 degrees), for whom the AUC is 0.55.
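For readers less familiar with these metrics, the sketch below shows how sensitivity, positive predictive value and overall accuracy follow from the four cells of a confusion matrix at a fixed classification threshold. The counts in the example are hypothetical and are not the counts behind the figures reported above.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, ppv, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of true cases that are flagged
    ppv = tp / (tp + fp)                  # share of flagged subjects that are true cases
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, ppv, accuracy


# Hypothetical counts, for illustration only.
sensitivity, ppv, accuracy = classification_metrics(tp=8, fp=2, tn=6, fn=4)
print(f"sensitivity={sensitivity:.2f}, PPV={ppv:.2f}, accuracy={accuracy:.2f}")
```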
The feature IDs, model weights and risk alleles of the high myopia genetic risk prediction model of the present invention are shown in Table 2 below (SNP positions refer to hg19; rs numbers are from dbSNP build 151).
TABLE 2
Figure BDA0003285972150000132
Figure BDA0003285972150000141
Figure BDA0003285972150000151
The gender weight used in the high myopia genetic risk prediction model is 0.319727. When the model is used, each genotype must be encoded strictly as the number (dosage) of risk alleles it carries before the weight is applied (for example, for rs7275695 the risk allele is C, so genotype CT is encoded as 1, CC as 2 and TT as 0).
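The dosage encoding and its use in a logistic score can be sketched as follows. This is a minimal illustration under stated assumptions: the SNP weight and the intercept are hypothetical placeholders (the real weights are those of Table 2, and no intercept is given in the text), while the gender weight and the gender coding (male = 1, female = 2) are taken from the description.

```python
import math

GENDER_WEIGHT = 0.319727  # from the description; gender coded male = 1, female = 2


def risk_allele_dosage(genotype, risk_allele):
    """Count copies of the risk allele in a two-letter genotype such as 'CT'."""
    return sum(1 for allele in genotype.upper() if allele == risk_allele.upper())


def risk_score(genotypes, model, gender, intercept=0.0):
    """Logistic risk score; `intercept` is a hypothetical placeholder."""
    linear = intercept + GENDER_WEIGHT * gender
    for snp_id, (risk_allele, weight) in model.items():
        linear += weight * risk_allele_dosage(genotypes[snp_id], risk_allele)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical one-SNP model: the weight 0.1 is invented for illustration.
toy_model = {"rs7275695": ("C", 0.1)}
print(risk_allele_dosage("CT", "C"))  # 1
print(risk_score({"rs7275695": "CC"}, toy_model, gender=1, intercept=-1.0))
```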
In the high myopia genetic risk prediction model provided by the invention, the features and weights are obtained by training with big data, machine learning and genomics methods. The approach is data-driven and targets high predictive power, which avoids the bias introduced by directly reusing features and weights from existing studies. Combining polygenic scores with gender, age or other environmental factors is the main research direction for current risk scoring. Strict performance evaluation was carried out in an independent cohort, which verified the discriminative ability of the model and its scope of application in risk assessment.
According to a preferred scheme of the invention, the high myopia genetic risk grading system divides the risk scores produced by the high myopia genetic risk prediction model into groups and indicates the degree of risk of each group according to the prevalence of high myopia in the collected data set.
The intervals are: grade 1: [0, 0.1706408), low risk; grade 2: [0.1706408, 0.2526106), low risk; grade 3: [0.2526106, 0.3388352), general risk; grade 4: [0.3388352, 0.4370253), higher risk; grade 5: [0.4370253, 0.5569740), high risk; grade 6: [0.5569740, 1], high risk. There are 6 grades in total, ordered from low to high risk.
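The grading intervals above can be applied to a risk score with a simple boundary lookup. The sketch below is one possible implementation; the function name is an assumption, and the cut points are the ones listed in the text, with each lower bound inclusive.

```python
from bisect import bisect_right

# Upper bounds of grades 1-5, as listed in the description.
GRADE_BOUNDS = [0.1706408, 0.2526106, 0.3388352, 0.4370253, 0.5569740]


def risk_grade(score):
    """Map a risk score in [0, 1] to a grade from 1 to 6."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("risk score must lie in [0, 1]")
    # bisect_right places a score equal to a cut point in the higher grade,
    # which matches the half-open intervals [lower, upper).
    return bisect_right(GRADE_BOUNDS, score) + 1


print(risk_grade(0.30))  # 3
```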
The system provides the distribution of high myopia risk grades in the general population, as well as the proportion of people at each risk grade who have actually developed high myopia. The proportion of the general population in each grade is: grade 1: 11.6%; grade 2: 19.4%; grade 3: 23.6%; grade 4: 20.7%; grade 5: 16.5%; grade 6: 8.3%.
The actual distribution of myopia for the population at each particular risk level is shown in table 3 below.
TABLE 3
Figure BDA0003285972150000161
In the high myopia genetic risk grading system provided by the invention, the genetic risk score produced by the model is assigned an individual risk grade with a description of the degree of risk, and the system also reports the overall risk distribution of the Chinese population and the proportion of people at the same risk grade who have actually developed high myopia. The system thus provides not only a qualitative risk warning but also quantitative distributions derived from real data, so that test subjects can better understand their own risk.
According to a third aspect of the present invention, there is provided a high myopia genetic risk assessment method which, as shown in FIG. 1, comprises the following steps: S1: obtaining a DNA sample from the test subject; S2: genotyping the DNA sample using the high myopia gene detection kit described above; S3: constructing a high myopia genetic risk prediction model and using it to obtain the high myopia genetic risk score of the test subject; S4: using the high myopia risk grading system to determine, as appropriate for the test subject, the individual risk grade, the overall risk distribution of the population, or the actual distribution of high myopia within a specific risk grade.
In step S3, the construction of the high myopia genetic risk prediction model includes the following steps:
S31: acquiring high myopia phenotype data from volunteers by means of a myopia questionnaire;
S32: acquiring gene data of the volunteers who took part in the myopia questionnaire;
S33: preprocessing the collected questionnaire and gene data and splitting them into a training set, a validation set and a test set in the usual machine learning manner;
S34: performing genome-wide association analysis on the training set to obtain the high myopia GWAS statistics, which include the site identifiers and the significance P values of their association with high myopia;
S35: filtering site sets according to the GWAS statistics, using pairwise combinations of the site linkage correlation r2 and the significance P value;
S36: for each filtered site set, adaptively performing feature selection with machine learning feature selection algorithms, then applying a logistic regression algorithm to the selected features on the training set to establish a high myopia genetic risk prediction model, and performing prediction and performance evaluation on the independent validation set;
S37: based on the predictive performance on the validation set, selecting the best-performing model and the optimal gene loci according to the AUC;
S38: applying the optimal model to the independent test set to obtain its predictive performance, risk stratification ability and population risk distribution in the general population.
Step S32 includes: genotyping on a high-throughput gene chip based on the Axiom Precision Medicine Research Array, using the GeneTitan Multi-Channel instrument platform, to acquire the gene data.
According to a fourth aspect of the present invention, there is also provided a use of a high myopia genetic risk assessment system in assessing myopia genetic risk, in particular the level of high myopia genetic risk.
The invention first provides a high myopia gene detection kit that overcomes the shortcomings of existing high myopia gene detection. The kit offers high detection efficiency, high throughput and low cost, makes population-wide screening of the genetic risk level of high myopia feasible, and enables early detection and early prevention of myopia. Second, through a large-scale genome-wide association study (GWAS) of a high myopia cohort, an optimal high myopia genetic risk prediction model is constructed and selected using machine learning (ML), and a high myopia genetic risk assessment system designed specifically for the Chinese population is provided together with a corresponding assessment method. Combined with the high myopia gene detection kit, the prediction model can deliver greater practical value. The invention mines the genetic information related to high myopia at the gene level in order to distinguish high-, medium- and low-risk groups.
In summary, the high myopia gene detection kit, the high myopia genetic risk assessment system and the high myopia genetic risk assessment method provided by the invention have the following significant advantages compared with the prior art:
1) The susceptibility sites found, and the site set used by the model, come from a large-scale genetic study of high myopia (more than 50,000 participants; to date the largest cohort study of high myopia in the Chinese population) and are therefore better suited to the Chinese population. All results are data-driven and do not rely on public data sets or on studies published by others. The new combination of sites presented here is not mentioned in any literature and was obtained by analyzing a large amount of raw data;
2) The risk prediction model is constructed with machine learning algorithms, applying a feature selection algorithm and an optimal model selection procedure on two independent sets (training and validation). Compared with the traditional PRS construction method, it achieves higher predictive performance with fewer model sites. The overall risk prediction performance and risk distribution of the model are demonstrated on an independent test set with a large sample size;
3) The high myopia genetic risk prediction model not only predicts well for the high myopia group but also has some predictive ability for moderate myopia, and it allows high-risk individuals to be identified at any age after birth;
4) Compared with technologies such as fluorescent quantitative PCR, gene chips, Sanger sequencing and NGS, the time-of-flight mass spectrometry method adopted by the invention is rapid, highly sensitive, low-cost, simple and practical, and is suitable for detecting multiple genes and multiple sites. No eye drops or physical examination are needed: the genetic risk of high myopia is predicted from a saliva sample alone. SNP typing is highly accurate, with a detection rate above 95% and an accuracy above 98%.
Drawings
FIG. 1 shows a schematic diagram of a high myopia genetic risk assessment method provided in accordance with the present invention;
FIG. 2 is a detailed technical schematic diagram of the construction process of a high myopia genetic risk prediction model and the design process of a gene detection kit provided by the invention, wherein the source and the method of a high myopia susceptibility gene locus set are particularly defined;
FIG. 3 shows a logical view of a myopia study questionnaire;
FIG. 4 shows the GWAS quality control flow chart;
Panels a and b of FIG. 5 show the Manhattan plot and the Q-Q plot of the GWAS statistics, respectively;
FIG. 6 shows a model predicted performance AUC plot for gene only and gene plus gender;
FIG. 7 is a box plot showing the distribution of genetic risk scores for the case (high myopia, right grey) and control (non-myopia, left black) populations in a separate test set consisting of a high myopia and non-myopia population;
FIG. 8 shows a 50-quantile plot of independent test sets consisting of highly myopic and non-myopic populations;
FIG. 9 shows a 50-quantile plot of independent test sets consisting of high myopia, moderate myopia and non-myopia populations;
FIG. 10 shows a primer design diagram of a high myopia genetic risk gene detection kit provided by the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1: construction of high myopia genetic risk prediction model
This embodiment provides a prediction model for assessing the genetic risk of high myopia. The model covers genetic polymorphic sites for high myopia in the Chinese Han population and includes 86 SNP sites of human genomic DNA plus a gender feature. The steps for constructing the high myopia genetic risk assessment model of the invention are shown in FIG. 2.
1) High myopia phenotype data are acquired. A myopia questionnaire is designed and distributed; research volunteers fill it in voluntarily, and the high myopia phenotype data are collected.
2) Gene data are obtained for the participants in the myopia questionnaire. The loci and their genotypes across the whole genome are obtained by SNP chip or sequencing, using bioinformatics analysis.
3) The collected questionnaire and gene data are preprocessed and split into sets. According to the questionnaire answers, subjects are divided into four categories: non-myopia, low myopia, moderate myopia and high myopia. Under the case-control paradigm, non-myopia and low myopia form the control group and high myopia forms the case group. A training set, a validation set and a test set are then split off in the usual machine learning manner. The three sets are mutually independent and share no individuals.
4) Genome-wide association analysis is performed on the training set to obtain the high myopia GWAS statistics, which include the site identifiers and the significance P values of their association with high myopia.
5) According to the GWAS statistics, site sets are filtered under pairwise combinations of the site linkage correlation r2 and the significance P value (the P_T strategy).
6) For each filtered site set, feature selection (over SNPs and gender) is performed adaptively with machine learning feature selection algorithms. Using the selected features, a logistic regression algorithm is applied to the training set to establish a risk prediction model for high myopia, and prediction and performance evaluation are carried out on the independent validation set.
7) Based on the predictive performance on the validation set, the best-performing model and the optimal gene loci are selected according to the AUC.
8) The optimal model is applied to the independent test set to obtain its predictive performance, risk stratification ability and population risk distribution in the general population.
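Step 7) selects among candidate models by their AUC on the validation set. As a minimal sketch of that selection, the AUC can be computed directly from its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen control). The function names and the example values are assumptions made for illustration, not part of the described method.

```python
def auc(labels, scores):
    """AUC as the share of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def best_model(validation_aucs):
    """Return the name of the candidate with the highest validation AUC."""
    return max(validation_aucs, key=validation_aucs.get)


# Hypothetical validation labels and scores.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise loop is quadratic in the sample size, which is acceptable for a sketch but would normally be replaced by a rank-based computation on large validation sets.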
In step 1), a myopia questionnaire is designed; its question logic is shown in FIG. 3. The question "What is your highest prescription power?" is offered in two forms: a scroll-wheel picker (selectable between 0 and 4000 degrees) or a single-choice question (below 300 degrees; 300-600 degrees; 600 degrees and above; unclear). It is a follow-up question shown when the answer to "Do you have, or have you ever had, myopia?" is "yes" (it is not shown when that answer is "no"). The questionnaire can be an offline questionnaire (in which the scroll-wheel picker is replaced by a handwritten entry) or an online electronic questionnaire. This embodiment uses a web-based electronic questionnaire that can be embedded in an app, under the name "myopia study", to enroll subjects. Subjects are required to complete the questionnaire. A total of 55,345 questionnaires were collected, with a male/female split of 24,591/30,754 and ages ranging from 0 to 80 years. The data can be exported as a txt file. The questionnaire need not include all of the questions in FIG. 3, but it preferably includes the questions described above.
In step 2), each participant in the high myopia questionnaire deposited 2 ml of saliva in a saliva storage tube and sent it to the laboratory, where the DNA was purified from the saliva and genotyped on a high-throughput gene chip based on the Axiom Precision Medicine Research Array (Affymetrix) using the GeneTitan Multi-Channel instrument platform (Thermo Fisher Scientific). The gene chip covers approximately 900,000 SNPs across the 22 autosomes, the 2 sex chromosomes and the mitochondrial genome. After processing by the Affymetrix genotype data production pipeline, the data are output in PLINK binary format.
In step 3)-1, the questionnaire data and the gene data are preprocessed and quality-controlled separately. For the questionnaire data, any version of the open-source statistical software R can be used for quality control and processing. Three items are extracted from the questionnaire: "age", "Do you have, or have you ever had, myopia?" and "What is your highest prescription power?". Samples aged 18-70 are retained, leaving 54,284 questionnaires. The 496 samples that answered "unclear" to "What is your highest prescription power?" are excluded. Samples that answered "yes" to "Do you have, or have you ever had, myopia?" but gave "0 degrees" as their highest prescription power are also excluded, leaving 53,741 questionnaires. According to the highest prescription power selected, "0-300 degrees" is classified as low myopia, "300-600 degrees" as moderate myopia and "above 600 degrees" as high myopia, while a "no" answer to "Do you have, or have you ever had, myopia?" is classified as non-myopia. For the subsequent analysis, the samples are coded according to the case-control paradigm. Because the high myopia under study is more strongly hereditary, whereas low myopia is more strongly influenced by the environment, the low myopia and non-myopia populations are used as controls and coded 0; the high myopia population forms the case group and is coded 1; the moderate myopia population is treated as another category (other) and coded 3. After coding there are 10,900 cases and 21,691 controls.
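The filtering and coding rules of step 3)-1 can be summarized in a small function. The text performs this step in R; the Python sketch below is only an illustration of the logic. The function and argument names are hypothetical, and the handling of the exact boundaries (300 and 600 degrees) follows the single-choice options of the questionnaire, which is an assumption because the text is not explicit about them.

```python
def code_phenotype(age, has_myopia, max_power):
    """Return 1 (case), 0 (control), 3 (other) or None (sample excluded).

    max_power is the highest prescription power in degrees, or None if the
    subject answered "unclear".
    """
    if not 18 <= age <= 70:
        return None                 # outside the retained age range
    if not has_myopia:
        return 0                    # non-myopia: control
    if max_power is None or max_power <= 0:
        return None                 # "unclear", or myopia reported with 0 degrees
    if max_power >= 600:
        return 1                    # high myopia: case
    if max_power >= 300:
        return 3                    # moderate myopia: other
    return 0                        # low myopia: control


print(code_phenotype(35, True, 650))  # 1
```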
In step 3)-2, the processed high myopia phenotype data are split into a training set, a validation set and a test set in a 7:1:2 ratio. Stratified sampling is used for the split: the createFolds() function of the third-party R package caret can be used to stratify on the field that records high myopia status, so that the case : control : other proportions are the same in the three data sets and there is no distribution bias.
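The text performs the stratified split with caret in R. For illustration only, the same idea can be sketched in Python with the standard library: samples are grouped by their phenotype code, each group is shuffled, and each group is cut in a 7:1:2 ratio. The function name, the seed and the toy labels are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios=(7, 1, 2), seed=0):
    """Split sample indices into train/validation/test, stratified by label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    total = sum(ratios)
    train, validation, test = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * ratios[0] / total)
        n_validation = round(len(indices) * ratios[1] / total)
        train += indices[:n_train]
        validation += indices[n_train:n_train + n_validation]
        test += indices[n_train + n_validation:]
    return train, validation, test


# Toy labels: 100 cases (1), 200 controls (0), 50 "other" (3).
toy_labels = [1] * 100 + [0] * 200 + [3] * 50
train, validation, test = stratified_split(toy_labels)
print(len(train), len(validation), len(test))  # 245 35 70
```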
In step 3)-3, the gene data are matched to the processed questionnaire data by sample number; the gene data can be filtered by sample using the --keep option of the widely used bioinformatics software PLINK. After the split, the training set contains 21,320 people, the test set 6,106 people and the validation set 3,055 people; a further set, matched in proportion to the test set and containing only "other" samples, comprises 3,916 people.
The quality control of the gene training set is shown in FIG. 4. All operations on autosomes can be carried out with the corresponding PLINK commands, and all operations on pseudo-autosomal sites with the corresponding XWAS commands. The steps are: removing sites with MAF below 0.001; removing samples and sites whose sample or site missing rate exceeds 0.02; removing sites that violate Hardy-Weinberg equilibrium; removing closely related samples; and removing samples not of Chinese Han ancestry, using MDS analysis in place of PCA. After quality control, the total number of autosomal and pseudo-autosomal SNPs is 394,519, and the number of non-pseudo-autosomal SNPs is 6,853. No imputation is performed; only the raw genotyping data are used for subsequent analysis.
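Two of the site-level filters above (MAF and site missing rate) can be illustrated with a short Python sketch. The text runs these filters in PLINK and XWAS; the function below only mirrors the two thresholds for a single site and does not cover the Hardy-Weinberg, relatedness or ancestry steps. The function name and the input layout (one allele dosage per sample, None for a missing call) are assumptions.

```python
def site_passes_qc(dosages, min_maf=0.001, max_missing=0.02):
    """Check one SNP against the MAF and site missing-rate thresholds.

    dosages: list with 0, 1 or 2 copies of one allele per sample,
    or None where the genotype call is missing.
    """
    called = [d for d in dosages if d is not None]
    if not dosages or not called:
        return False
    missing_rate = 1 - len(called) / len(dosages)
    if missing_rate > max_missing:
        return False
    allele_freq = sum(called) / (2 * len(called))
    maf = min(allele_freq, 1 - allele_freq)  # frequency of the rarer allele
    return maf >= min_maf


print(site_passes_qc([0, 1, 2, 0] * 25))  # True
```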
In step 4), genome-wide association analysis is performed on the phenotype training set and the gene training set using case-control coding, a logistic regression algorithm and an additive model: a SNP genotype with 2 normal alleles is coded as 0 risk alleles, one with 1 normal allele as 1 risk allele, and one with 0 normal alleles as 2 risk alleles. The covariates are the population-structure components PC1-PC5 and gender, which control for false-positive associations caused by population and gender stratification. This yields the high myopia GWAS statistics, including the significance P value of each site's association with high myopia, which serves as the key indicator for the subsequent site filtering. Only autosomal and pseudo-autosomal SNPs are used in subsequent analyses.
The Manhattan plot and the Q-Q plot of the GWAS statistics are shown in panels a and b of FIG. 5, respectively.
In step 5), site sets are screened by the pruning and thresholding method, based on the P values of the high myopia GWAS statistics and the linkage r2 between sites in the training set. The P value threshold can be set to 5e-4, 5e-5 or 5e-8 and the linkage r2 threshold to 0.2, 0.5, 0.8 or 0.95; combining the two criteria pairwise gives 12 (3 x 4) settings for screening SNP sets, each of which yields one set of gene loci. The results are shown in Table 4 below.
TABLE 4
Figure BDA0003285972150000191
Figure BDA0003285972150000201
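The 12 threshold combinations of step 5) can be enumerated as a grid. The sketch below pairs that grid with a simple greedy filter: SNPs below the P threshold are visited from most to least significant, and a SNP is dropped if its r2 with an already kept SNP exceeds the r2 threshold. The exact pruning procedure is not spelled out in the text, so the greedy rule, the function names and the toy inputs are all assumptions.

```python
from itertools import product

P_THRESHOLDS = (5e-4, 5e-5, 5e-8)
R2_THRESHOLDS = (0.2, 0.5, 0.8, 0.95)


def prune_and_threshold(p_values, r2, p_max, r2_max):
    """Greedy P+T filter over a dict of P values and a dict of pairwise r2."""
    candidates = sorted(
        (snp for snp, p in p_values.items() if p < p_max),
        key=p_values.get,
    )
    kept = []
    for snp in candidates:
        # Pairs absent from `r2` are treated as unlinked (r2 = 0).
        if all(r2.get(frozenset((snp, other)), 0.0) <= r2_max for other in kept):
            kept.append(snp)
    return kept


def all_site_sets(p_values, r2):
    """One filtered site set per (P threshold, r2 threshold) combination."""
    return {
        (p_max, r2_max): prune_and_threshold(p_values, r2, p_max, r2_max)
        for p_max, r2_max in product(P_THRESHOLDS, R2_THRESHOLDS)
    }


# Toy inputs with hypothetical SNP names.
toy_p = {"snp_a": 1e-9, "snp_b": 1e-6, "snp_c": 1e-3}
toy_r2 = {frozenset(("snp_a", "snp_b")): 0.6}
site_sets = all_site_sets(toy_p, toy_r2)
print(len(site_sets))  # 12
```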
In step 6), the process is carried out in 4 steps.
Step 6)-1: The phenotype training set and the gene training set are merged by sample code. An SNP containing 2 normal alleles is coded as 0 risk alleles, an SNP containing 1 normal allele as 1, and an SNP containing 0 normal alleles as 2 (consistent with the genotype coding in step 4, i.e. with the same normal and risk alleles). High myopia status is coded in case-control form: case = 1, control = 0. The gene data can be read and coded with the read.plink() function of the third-party package snpStats in the open-source statistical software R.
Step 6)-2: The corresponding subset of the training set is extracted according to the SNP locus set filtered by the P_T strategy, and gender is added as a candidate predictor, with male coded as 1 and female as 2.
Step 6)-3: The locus set is filtered again with feature selection algorithms commonly used in machine learning, chosen adaptively according to the r2 and P value combination of the P_T strategy. Three common feature selection algorithms are used to eliminate sites with no effect, sites with negligible effect, and redundant collinear sites. The adaptive rule is as follows: at the P = 5 x 10^-5 level the algorithms are lasso, the random-forest-based Boruta, and stepwise regression (step); at the P = 5 x 10^-8 level they are lasso and stepwise regression; at the P = 5 x 10^-4 level only lasso is used. Lasso uses the cv.glmnet function of the third-party R package glmnet (parameters: family = "binomial", type.measure = "auc", standardize = TRUE, nfolds = 5) and outputs the features whose regression coefficients are not 0. Boruta performs feature selection with the Boruta(), TentativeRoughFix() and getSelectedAttributes() functions of the third-party R package Boruta. Stepwise regression calls the built-in step(fit) function directly, where fit is a logistic regression model fitted to the unfiltered sites with the glm function; the features retained in the final stepwise model form the output feature set.
Step 6)-4: Using the features selected by each feature selection algorithm in each combination (18 combinations of feature selection algorithm and P_T strategy in total), a logistic regression model for high myopia is built. The dependent variable is the 0/1-coded high myopia status (control/case), and the independent variables are the gene training data restricted to the feature-selected locus set.
In step 7), the 18 models are applied to the independent verification set, and the AUC value commonly used in machine learning is taken as the measure of each model's classification performance on the verification set, as shown in Table 5 below.
TABLE 5
Figure BDA0003285972150000202
Figure BDA0003285972150000211
The model with the best classification performance is the one with the highest AUC value on the verification set. The test set and the verification set are additionally filtered to exclude samples with a missing value at any site (filtering 1310 samples and 648 samples, respectively) to ensure the accuracy of verification and testing. The selected P_T combination is P = 5 x 10^-5 and r2 = 0.2, the selected feature selection method is lasso, and the number of selected features is 87 (86 SNPs + gender). The molecular markers and their weights in the optimal high myopia genetic risk assessment model are presented in the summary section.
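Choosing the model with the highest verification-set AUC can be sketched in Python as below. The AUC is computed here as the probability that a randomly chosen case scores higher than a randomly chosen control (ties count one half), which is a standard equivalent definition; the model names and scores are invented for illustration and are not the values of Table 5.

```python
def auc(labels, scores):
    """AUC as the fraction of (case, control) pairs in which the case has the
    higher score; tied pairs contribute 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def best_model(predictions, labels):
    """Return the name of the model whose scores give the highest AUC."""
    return max(predictions, key=lambda name: auc(labels, predictions[name]))


# Hypothetical verification-set labels and model scores.
labels = [1, 1, 0, 0]
predictions = {
    "model_a": [0.9, 0.4, 0.5, 0.1],
    "model_b": [0.6, 0.7, 0.2, 0.3],
}
print(best_model(predictions, labels))
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a sketch; a rank-based computation would be preferable for large sets.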
In step 8), to verify the high myopia prediction and risk discrimination ability of the optimal model in the population, the optimal model is applied to an independent test set to evaluate its prediction performance. The AUC is 0.62-0.63.
The ROC curves and AUC values of the gene + gender (grey) and gene-only (black) models on the set of highly myopic and non-myopic samples are shown in fig. 6.
Boxplots of the genetic risk score distributions for the case (high myopia, grey, right) and control (non-myopia, black, left) populations in the independent test set consisting of the highly myopic and non-myopic populations are shown in figure 7. The vertical axis represents the genetic risk score, and the small circle in each group marks the mean genetic risk score of that group. The genetic risk scores of the highly myopic population in the test set are higher overall than those of the non-myopic population.
There are two schemes for constructing the test set:
1) The independent test set consists of highly myopic and non-myopic populations; its 50-quantile plot is shown in figure 8. The upper panel plots the proportion of high myopia against the 50 quantiles of the model's prediction score in the set of high myopia and non-myopia samples, with the quantile on the x axis and the proportion of high myopia samples on the y axis; the lower panel plots the proportion of high myopia against the quantile of the high myopia genetic risk score separately for men and women. The prevalence ratio between the highest percentile and the median of the risk score distribution (RR) is 1.97, and the corresponding case/control ratio (OR) is 3. With a classification threshold of 0.5, the sensitivity is 0.89, the PPV is 0.69 and the accuracy is 0.67. The detection sensitivity of the high myopia group can be improved by increasing the classification threshold line.
2) The independent test set consists of high myopia, moderate myopia and non-myopia groups; its 50-quantile plot is shown in figure 9. The upper panel plots the proportion of high myopia against the 50 quantiles of the model's prediction score in the set of high myopia, low and moderate myopia, and non-myopia samples, with the quantile on the x axis and the proportion of high myopia samples on the y axis; the lower panel plots the proportion of high myopia against the quantile separately for men and women. The model also has some ability to distinguish the moderately myopic population, with an AUC of about 0.55.
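The sensitivity, PPV and accuracy quoted for a classification threshold of 0.5 follow from the confusion matrix at that threshold. A minimal Python sketch is given below; the labels and scores are invented and do not reproduce the patent's figures, and the function name is hypothetical.

```python
def classification_metrics(labels, scores, threshold=0.5):
    """Sensitivity, PPV and accuracy when scores >= threshold are called cases."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    return {
        # None marks a ratio that is undefined because its denominator is 0.
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "ppv": tp / (tp + fp) if tp + fp else None,
        "accuracy": (tp + tn) / total if total else None,
    }


# Hypothetical test-set labels (1 = high myopia) and model scores.
labels = [1, 1, 1, 0, 0]
scores = [0.8, 0.6, 0.4, 0.7, 0.2]
metrics = classification_metrics(labels, scores, threshold=0.5)
print(metrics)
```

Re-running the function with different threshold values shows how each metric responds to the choice of cut-off on a given data set.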
Example 2: construction of high myopia genetic risk gene detection kit
The sites and primers of the kit are designed as follows:
The SNP sites of the high myopia genetic risk prediction model obtained in Example 1 are selected as the gene molecular sites of the kit, forming a detection PANEL. For all SNP sites in the resulting high myopia genetic risk level detection PANEL, upstream and downstream primer pairs for multiplex PCR amplification and single-base extension primer sequences (given in the summary of the invention) are designed; a schematic diagram of the primer design is shown in FIG. 10.
Details regarding the kit:
1.1 Main components
PCR reaction mix, Taq Enzyme, amplification primer mix, SAP Buffer, SAP Enzyme, single-base extension reaction mix, iPlex Enzyme, extension primer mix, ddH2O, a positive control, desalting resin and a mass spectrometry chip.
1.2 Storage conditions
The product is stored at -20 ℃.
1.3 Matched instruments: a general-purpose PCR instrument; DR MassARRAY.
1.4 Sample requirements
The product is suitable for genomic DNA extracted from oral mucosa cells, exfoliated oral cells, saliva, blood and dried blood spots; the DNA A260/A280 ratio must be between 1.8 and 2.0. Frozen DNA samples should be stored below -20 ℃, and repeated freezing and thawing should be avoided.
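The A260/A280 acceptance criterion can be expressed as a simple check. The following Python sketch is illustrative only; the function name and the absorbance readings are hypothetical.

```python
def dna_purity_ok(a260, a280, low=1.8, high=2.0):
    """Return True when the A260/A280 ratio lies within [low, high]."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return low <= a260 / a280 <= high


# Hypothetical spectrophotometer readings.
print(dna_purity_ok(0.95, 0.50))  # ratio 1.9
print(dna_purity_ok(0.80, 0.50))  # ratio 1.6
```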
1.5 Test method
I. PCR reaction
1) Add the reagent components in the order given in Table 6 below to prepare the mix for a 5 μL PCR reaction system, and dispense it into a 96-well PCR plate at 3 μL per well. The PCR reaction system is shown in Table 6 below.
TABLE 6
Figure BDA0003285972150000221
2) Take out the DNA template, thaw it on ice (4 ℃), vortex for 10 s and centrifuge briefly; draw off an aliquot of DNA and dilute it to 5 ng/μL for later use.
3) Add 2 μL of the 5 ng/μL DNA template to each well of the 96-well plate, cap the tubes, vortex for 10 s and centrifuge briefly. A blank control (2 μL ddH2O), a negative control (2 μL DNA extraction eluate) and a positive control must be set up for each experiment.
4) Place the 96-well PCR plate in the amplification instrument and run the PCR program, as follows:
Figure BDA0003285972150000222
After completion, hold at 4 ℃.
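Diluting an extracted DNA stock to the 5 ng/μL working concentration follows the usual C1·V1 = C2·V2 relation. The Python sketch below is a minimal illustration; the stock concentration and final volume in the example are hypothetical.

```python
def dilution_volumes(stock_ng_per_ul, final_volume_ul, target_ng_per_ul=5.0):
    """Return (stock volume, diluent volume) in μL for a C1*V1 = C2*V2 dilution."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Hypothetical example: a 50 ng/μL stock diluted to 20 μL at 5 ng/μL.
stock_ul, diluent_ul = dilution_volumes(50.0, 20.0)
print(stock_ul, diluent_ul)
```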
II. SAP reaction
After the PCR reaction is complete, prepare the SAP mixture in a 1.5 mL EP tube. The SAP reaction mixture consists of:
SAP buffer: 0.17 μL of CutSmart buffer (manufacturer: NEB)
SAP enzyme: 0.5 U (manufacturer: NEB)
ddH2O: make up to 2 μL
1) Add 2 μL of SAP mixture to each well (total volume after addition: 7 μL).
2) Seal the plate with film (a good-quality film from Life or another company), vortex and centrifuge (4000 rpm, 5 seconds).
3) Place the plate on a PCR instrument and run the following program:
37 ℃ for 40 minutes,
85 ℃ for 5 minutes,
hold at 4 ℃.
III. Extension reaction
1) The SAP reaction plate was removed and centrifuged at 2000 rpm for 1 min.
2) Prepare the iPLEX extension mix in a 1.5 mL tube according to Table 7. The amounts in Table 7 are calculated for a 96-well plate plus a 38% excess; adjust them according to the actual number of reactions.
TABLE 7
Component | Amount
Single-base extension reaction mix (buffer + acyNTPs; from NEB) | 0.4 μL
iPlex Enzyme (from NEB) | 1 U
Extension primer mix | 0.94 μL
Water | make up to 2 μL
3) Add 2 μ L of iPLEX extension mix to each well and mix well (total volume after mix addition: 9 μ L).
4) The plates were sealed with a membrane, vortexed and centrifuged (4000rpm for 5 seconds).
5) The 96-well plate was placed on a PCR instrument for the following thermal cycling:
Figure BDA0003285972150000231
Hold at 4 ℃.
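Scaling the extension mix to the number of reactions, including the 38% excess, can be sketched as follows. This minimal Python sketch assumes the amounts in Table 7 are per-well quantities (they total the 2 μL added to each well); the function name and dictionary keys are hypothetical, and the enzyme is tracked in units because the text gives no enzyme volume.

```python
def scale_master_mix(per_well, n_wells, excess=0.38):
    """Multiply each per-well amount by n_wells * (1 + excess)."""
    if n_wells <= 0:
        raise ValueError("n_wells must be positive")
    factor = n_wells * (1.0 + excess)
    return {name: round(amount * factor, 3) for name, amount in per_well.items()}


# Per-well amounts taken from Table 7 (assumed to be per well).
per_well = {
    "extension_reaction_mix_ul": 0.4,
    "extension_primer_mix_ul": 0.94,
    "iplex_enzyme_units": 1.0,
    "total_volume_ul": 2.0,
}
scaled = scale_master_mix(per_well, 96)
print(scaled)
```

Passing a smaller `n_wells` gives the corresponding amounts for a partial plate.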
IV. Resin purification
1) Spread clean resin on the 96-well/15 mg dimple plate, scrape it level with the scraper by pushing the resin back and forth, and compact it so that every well holds the same amount of resin. The resin is ready for use when it turns from dark yellow to light yellow.
2) Add 41 μL of water to each well of the sample plate, seal with film (ordinary film is sufficient) and centrifuge.
3) Gently turn the sample plate upside down and place it on the resin-filled dimple plate so that the wells of the two plates line up one to one. Then invert the dimple plate together with the sample plate (the two plates must not shift horizontally during this step) so that the resin falls into the wells.
4) Seal the plate with film (ordinary film is sufficient), place it on a rotator and rotate it end over end for 15 minutes.
5) Centrifuge the plate for 5 minutes at 3200 g (4000 rpm on a standard plate centrifuge).
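The stated equivalence between 3200 g and 4000 rpm depends on the rotor radius, through the standard relation RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm). The Python sketch below shows the conversion; the 17.9 cm radius is an assumption inferred from the stated pair of values, not a figure given in the text, so the radius of the actual centrifuge should be used instead.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard constant for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (in multiples of g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed in rpm needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


ASSUMED_RADIUS_CM = 17.9  # assumption: implied by 3200 g at 4000 rpm
print(round(rcf_from_rpm(4000, ASSUMED_RADIUS_CM)))
print(round(rpm_from_rcf(3200, ASSUMED_RADIUS_CM)))
```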
V. Mass spectrometry
Mass spectrometry is performed on the DR MassARRAY.
1) Open the plate management system software and edit the experiment plan file, which includes the sample positions, sample names and primers used; then link the mass spectrometer to the experiment plan file.
2) Click the Start All icon to start the software, and check that all indicator lights are normal.
3) Click the "chip tray enter/exit" button, place the chip on the tray and then on the chip deck, and record the chip position (1 on the left, 2 on the right). Do not touch the chip surface with your hands. Place the 96-well plate at the position marked MTP1/2 and fix it with well A1 at the lower left corner. When a chip is used for the first time, add 75 μL of calibration standard to the calibrant loading slot; otherwise no calibration standard needs to be added. Then click the "chip tray enter/exit" button again and close the clamp plate.
4) Click the "add/maintain resin" button, open the resin tank, and add resin or top up with autoclaved purified water.
A. When the instrument is started for the first time, add 28 g of resin to the resin tank, add 16 mL of sterilized purified water and mix well.
B. When the resin is used for the first time, pour all 9 g of resin into the resin tank, add 5.2 mL of autoclaved purified water and mix well with a pipette tip.
C. When it is not the first use, check the liquid level; if the water level is below the resin surface, top up with autoclaved purified water until the water level is above the resin surface.
D. Once added to the resin tank, the resin solution should be used as soon as possible and must not be left for more than 30 days.
5) The program set-up parameters are shown in table 8 below.
TABLE 8
Figure BDA0003285972150000241
6) After the mass spectrometry run has finished, click the "remove old chip from analyzer" button to return the chip to the chip deck, then click the "chip tray enter/exit" button, take out the 96-well plate, seal it with film and store it at -20 ± 5 ℃. Put the chip back in its packaging box and store it in a dehumidifier (once opened, a chip should be used as soon as possible and stored for no more than 30 days). Recover the calibration standard and store it at -20 ± 5 ℃, then click the "chip tray enter/exit" button and close the clamp plate.
1.6 Interpretation of test results
Validity determination: the test result is valid when the standard yields the corresponding genotype, the blank control (ddH2O) gives no signal, and the weak positive control yields the corresponding positive signal; otherwise the test result is invalid.
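The validity rule combines three control checks, all of which must pass. A minimal Python sketch, with hypothetical argument names, is:

```python
def run_is_valid(standard_genotype_ok, blank_has_signal, weak_positive_detected):
    """A run is valid only if the standard gives its expected genotype, the
    ddH2O blank shows no signal, and the weak positive control is detected."""
    return standard_genotype_ok and not blank_has_signal and weak_positive_detected


print(run_is_valid(True, False, True))   # all controls behave as expected
print(run_is_valid(True, True, True))    # contaminated blank invalidates the run
```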
Example 3: Application to assessment of high myopia and the genetic risk level of myopia
One saliva sample from a child with high myopia was selected and numbered mofang-highmyopia_01; 3 replicate wells were set up for the sample, and SNP typing was performed on a nucleic acid mass spectrometry platform as follows:
1.1 DNA extraction
DNA extraction comprises the following steps:
1) Take out a 2 mL centrifuge tube and label it with the sample number.
2) Add 10 μL of FineMag Particles G, 30 μL of proteinase K and 350 μL of Buffer MLD to each tube.
3) Transfer 500 μL of saliva sample to the centrifuge tube, mix by inverting 3 times and vortex at high speed for 15 sec. Incubate in a constant-temperature incubator at 45 ℃ for 10 min.
4) Place the centrifuge tube on a magnetic stand and let it stand for 30 sec; once the magnetic beads are completely adsorbed, carefully remove the liquid.
5) Remove the centrifuge tube from the magnetic stand, add 600 μL of Buffer RBP and resuspend the beads by vortexing for 15 sec.
6) Place the centrifuge tube on the magnetic stand and let it stand for 30 sec; once the magnetic beads are completely adsorbed, carefully remove the liquid with a pipette.
7) Repeat steps 5) and 6) once.
8) Remove the centrifuge tube from the magnetic stand, add 600 μL of 80% ethanol and resuspend the beads by vortexing for 15 sec.
9) Place the centrifuge tube on the magnetic stand and let it stand for 30 sec; once the magnetic beads are completely adsorbed, remove the liquid with a pipette.
10) Remove the centrifuge tube from the magnetic stand, add 600 μL of absolute ethanol and resuspend the beads by vortexing for 15 sec.
11) Place the centrifuge tube on the magnetic stand and let it stand for 30 sec; once the magnetic beads are completely adsorbed, carefully remove the liquid with a pipette.
12) Keep the centrifuge tube on the magnetic stand, open the cap and air-dry at room temperature for 10 min.
13) Remove the tube from the magnetic stand, add 50 μL of Buffer EB, pipette to mix well, and incubate at 70 ℃ for 5 min.
14) After brief centrifugation, place the centrifuge tube on the magnetic stand and let it stand for 1-2 min. Once the beads are completely adsorbed, transfer the DNA solution to a new labeled 0.6 mL centrifuge tube.
1.2 Obtaining SNP typing data and coding
The extracted DNA was typed, and the test results output, using the high myopia gene detection kit, a time-of-flight mass spectrometer and TYPER4.0 software, following the experimental procedure given in Example 2.
The genotyping results of mofang-highmyopia_01 are shown in Table 9 below.
TABLE 9
Figure BDA0003285972150000251
Figure BDA0003285972150000261
Figure BDA0003285972150000271
After SNP genotyping of the subject (female) with the high myopia genetic risk detection kit, the gender of the test sample is obtained. Each SNP is coded by its number of risk alleles (homozygous normal = 0, heterozygous = 1, homozygous risk = 2), according to the risk allele specified for that locus in the model and the genotyping result at that locus; gender is coded as male = 1, female = 2. The coded results are shown in the third column of Table 9 above.
Calculation formula: high myopia genetic risk score = SNP_1 × Coffi_1 + SNP_2 × Coffi_2 + ... + SNP_86 × Coffi_86 + gender × Coffi_gender + intercept.
As described above, the coded value of each feature is multiplied by its weight in the model, the products are summed, and the intercept term (-1.685245) is added; the resulting score is the subject's high myopia genetic risk score.
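The weighted sum described above can be sketched in Python as follows. The intercept is the value quoted in the text; the feature codes and weights in the example are hypothetical, since the actual weights are listed elsewhere in the patent. The logistic transform is included only as an assumption: the model is a logistic regression and the grading intervals span 0 to 1, but the text here does not state whether the reported score is the linear sum or its logistic transform.

```python
import math

INTERCEPT = -1.685245  # intercept term quoted in the text


def linear_score(codes, weights, intercept=INTERCEPT):
    """Sum of coded feature values times their weights, plus the intercept."""
    if len(codes) != len(weights):
        raise ValueError("codes and weights must have the same length")
    return sum(c * w for c, w in zip(codes, weights)) + intercept


def logistic(x):
    """Logistic transform mapping a linear score to the interval (0, 1).
    Applying it to the risk score is an assumption, not stated in the text."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical example: three SNP codes followed by the gender code (female = 2).
codes = [0, 1, 2, 2]
weights = [0.10, 0.20, 0.30, 0.05]   # made-up weights
score = linear_score(codes, weights)
print(round(score, 6), round(logistic(score), 4))
```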
3.3 Risk grading of the resulting score according to the risk classification system
Integrating the genetic data and gender information through the model gives a genetic risk score of 0.6532. Risk grading is then performed with the accompanying high myopia risk classification system, according to the segmentation intervals given by the invention (0-0.1706408, 0.1706408-0.2526106, 0.2526106-0.3388352, 0.3388352-0.4370253, 0.4370253-0.5569740 and 0.5569740-1). The individual high myopia genetic risk of mofang-highmyopia_01 falls in grade 6, the highest grade; the risk category is "high risk group", and the risk is clearly higher than the prevalence of high myopia. The population proportion of each risk grade is shown in Table 10 below.
TABLE 10
Risk grade | Percentage of the population
Grade 1 | 11.6%
Grade 2 | 19.4%
Grade 3 | 23.6%
Grade 4 | 20.7%
Grade 5 | 16.5%
Grade 6 | 8.3%
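Mapping a score to one of the six grades using the stated segmentation intervals can be sketched with a binary search over the interval boundaries. In this Python sketch the boundaries are taken from the text; treating a score that falls exactly on a boundary as belonging to the higher grade is an assumption, since the text does not say which side is inclusive.

```python
from bisect import bisect_right

# Upper boundaries of grades 1 to 5, taken from the segmentation intervals.
GRADE_BOUNDARIES = (0.1706408, 0.2526106, 0.3388352, 0.4370253, 0.5569740)


def risk_grade(score):
    """Return the risk grade (1 to 6) for a score between 0 and 1."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    return bisect_right(GRADE_BOUNDARIES, score) + 1


print(risk_grade(0.6532))  # the score reported for the example sample
```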
Among the population at grade 6, the ratio of highly myopic to non-myopic individuals is 53.8% vs 46.2%. When moderately myopic individuals are included, the proportions of highly myopic, non-myopic and moderately myopic individuals at grade 6 are 29.8% vs 25.7% vs 44.5%. The 29.8% proportion is much greater than the prevalence of high myopia in the general population (< 20%). Because some individuals in the moderate myopia group had not yet developed high myopia when the questionnaire was completed, combining the high and moderate myopia groups shows that 74.3% of the grade 6 population has already developed myopia above 300 degrees.
The above embodiments are merely preferred embodiments of the present invention, which are not intended to limit the scope of the present invention, and various changes may be made in the above embodiments of the present invention. All simple and equivalent changes and modifications made according to the claims and the content of the specification of the present application fall within the scope of the claims of the present patent application. The invention has not been described in detail in order to avoid obscuring the invention.
SEQUENCE LISTING
<110> Chengdu twenty three magic cube Biotechnology Ltd
<120> high myopia gene detection kit, and high myopia genetic risk assessment system and method
<160> 258
<170> PatentIn version 3.5
<210> 1
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 1
caagaaatta cagacatgat t 21
<210> 2
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 2
ggacgtcagg atggataagt g 21
<210> 3
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 3
gtcacagtag tcttagaact c 21
<210> 4
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 4
ccctcacagg gtggggcagg t 21
<210> 5
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 5
aagatgtaat ttatatatct c 21
<210> 6
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 6
aaatcaatgc ataaatatca g 21
<210> 7
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 7
aatacaaaca gatgcagaaa g 21
<210> 8
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 8
tatacacaca cagaatcaaa a 21
<210> 9
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 9
ttgatacaca taaaagaata c 21
<210> 10
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 10
gtgggacttg cgacaggggt g 21
<210> 11
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 11
agctgagaag tcttcctcct t 21
<210> 12
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 12
cagacccacg tttcagtctc t 21
<210> 13
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 13
ttgtttgagt ttcttgtaga t 21
<210> 14
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 14
tagatattgg aaagttttca g 21
<210> 15
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 15
atgctaagga cccctgggac t 21
<210> 16
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 16
agctggctaa aggggtgtag g 21
<210> 17
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 17
aaattcactc accagaatca t 21
<210> 18
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 18
cactgtggaa cagctggctc t 21
<210> 19
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 19
gctgataaag atgtacgtga g 21
<210> 20
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 20
gcttgcaagt gagcattact c 21
<210> 21
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 21
tgtttgttgt ttttgttttt g 21
<210> 22
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 22
tcaaatgtag ataaaccaga c 21
<210> 23
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 23
ttctagatag tactttattg t 21
<210> 24
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 24
gtattgctct atatcagaaa c 21
<210> 25
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 25
cagggcaaga ttctgtggga c 21
<210> 26
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 26
tttcttaaag gttatatttt g 21
<210> 27
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 27
agtaagtgag caaaatgtta t 21
<210> 28
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 28
gagaagaaaa agtggcaaaa c 21
<210> 29
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 29
cctatatgtc cacatagtcc a 21
<210> 30
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 30
ttattggtag ggttattatc a 21
<210> 31
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 31
caattactct cacccatcat c 21
<210> 32
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 32
tctcacacgg tcttccatgc a 21
<210> 33
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 33
caagcccaca cctgctcaga g 21
<210> 34
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 34
ggacagggta gactcctgcg g 21
<210> 35
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 35
ctaatacatt tttagccacg t 21
<210> 36
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 36
ttttgaaggg aagtgcagac a 21
<210> 37
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 37
agggataaat gcaatagaca t 21
<210> 38
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 38
gtgtggttat ctggcctgag g 21
<210> 39
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 39
ctctctgttt catccacact g 21
<210> 40
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 40
aacacaaata tttttcctca g 21
<210> 41
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 41
aatttacata agtaggaaag t 21
<210> 42
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 42
accctgatgg aggcaggggc t 21
<210> 43
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 43
aagcagctag aactgcccat a 21
<210> 44
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 44
tagtcaccac agtatcatag t 21
<210> 45
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 45
tgacttggcg ggatcactgt c 21
<210> 46
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 46
gagaatttca agatgatgag a 21
<210> 47
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 47
tttcaaataa tgcccttagt c 21
<210> 48
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 48
tctgtcctac tcagtcaaga a 21
<210> 49
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 49
cagtgttggt agctgacagt c 21
<210> 50
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 50
ctgttcgtaa tggtatgatg c 21
<210> 51
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 51
ccccagaact taataaaacc t 21
<210> 52
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 52
gagccgataa ggtaaaatga a 21
<210> 53
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 53
ccacagaagt cagagtgctg t 21
<210> 54
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 54
tcagggaggc aaggggcttt g 21
<210> 55
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 55
ttctgctggg ttggtgctga t 21
<210> 56
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 56
tttcatgtgg gaaaaataga t 21
<210> 57
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 57
aggaagcctg cctttggtta t 21
<210> 58
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 58
gtctgttgct tggagccaag a 21
<210> 59
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 59
aggtccttct gtcacaggaa c 21
<210> 60
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 60
aactagggca tctcttattg t 21
<210> 61
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 61
ctttctctga ttagaaagga a 21
<210> 62
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 62
accaattaac attcattttt t 21
<210> 63
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 63
gtacttcctc tcccactaat t 21
<210> 64
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 64
atatatatgt tggctccaaa a 21
<210> 65
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 65
agggagaaaa cttttcaccc a 21
<210> 66
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 66
ccaggtgata ggcacacttt t 21
<210> 67
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 67
aaaaaagaaa tctcgtggga g 21
<210> 68
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 68
agtgactctc attccagggc t 21
<210> 69
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 69
actctgccac tcttcccatc a 21
<210> 70
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 70
aaggccagca cattgaggct g 21
<210> 71
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 71
ggcactggct tttccatggg c 21
<210> 72
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 72
acaggagttc aattcagcgg t 21
<210> 73
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 73
ggaagaagcc tcggaggcag a 21
<210> 74
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 74
gagaccgctg gccgcctgtg g 21
<210> 75
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 75
tgcttttgta tatctcttac a 21
<210> 76
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 76
taggaagcat tcaggaggcc c 21
<210> 77
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 77
ggatcttcca ggcaggatgt g 21
<210> 78
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 78
gccatgctgc acagccagct g 21
<210> 79
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 79
ttaaaaaagc ataccattaa t 21
<210> 80
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 80
tttctatttt tctattacca t 21
<210> 81
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 81
ctgtaaaatg gggatattac t 21
<210> 82
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 82
ctgccagtta caaattcaga a 21
<210> 83
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 83
aggtaagtaa tgcatcatca a 21
<210> 84
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 84
aatatagttt ctaaaatgtg a 21
<210> 85
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 85
aaagcagtgc ctctttagga t 21
<210> 86
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 86
gagcagtacg atgagtgctg a 21
<210> 87
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 87
aatgggctta gtattctggg a 21
<210> 88
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 88
tgaatgaaat tatgtacagt c 21
<210> 89
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 89
gtcggaccac ggacccttgc c 21
<210> 90
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 90
gcccccagga cccctgcttc t 21
<210> 91
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 91
catcaaattt agtcataaaa a 21
<210> 92
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 92
gaaggaacct tattgagcat c 21
<210> 93
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 93
gtttcagtgt tctaccaccc a 21
<210> 94
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 94
ttatgaaagt tccactagta c 21
<210> 95
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 95
cagctcatcc accctccctc c 21
<210> 96
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 96
agcacacctc tcttcaagat g 21
<210> 97
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 97
tgggaaaccc agctgtgaag a 21
<210> 98
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 98
ggggtggtgg agcagagtcc a 21
<210> 99
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 99
atcaatcaat gcaacttggc a 21
<210> 100
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 100
tatgattatc aaatttatgg t 21
<210> 101
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 101
ggaggtgtga gctaggactg c 21
<210> 102
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 102
tccttgctgg gtagacctcc t 21
<210> 103
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 103
tcaacacatg cttaatgaaa a 21
<210> 104
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 104
ccccagcaca ggtcaaaata g 21
<210> 105
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 105
catatttata accccaggcg a 21
<210> 106
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 106
ctctaacgct atacctacca g 21
<210> 107
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 107
tctagggtct gagatcagct g 21
<210> 108
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 108
cctgacacct ggaaaggtcg t 21
<210> 109
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 109
catacacatt ttctgttgct t 21
<210> 110
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 110
ttgtctgtgc tctttgagag g 21
<210> 111
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 111
tcatttttat tgaattaaat t 21
<210> 112
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 112
ctgggtacgg aaattctttt g 21
<210> 113
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 113
tgctccatac agcaggtctg t 21
<210> 114
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 114
tctgccaacc agataatttc t 21
<210> 115
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 115
ttaccgtggc aaatttctat c 21
<210> 116
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 116
ggacttttat tttatggagg a 21
<210> 117
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 117
agctagttta aatcggactt c 21
<210> 118
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 118
actggtaaat ttacccccat g 21
<210> 119
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 119
caatgaagac aatatgagct c 21
<210> 120
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 120
ctggggggtg ttgaataaag c 21
<210> 121
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 121
gcaaggccaa gggagagcag c 21
<210> 122
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 122
cctattcaca acctgccctt g 21
<210> 123
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 123
tcaagagact cttacctcat c 21
<210> 124
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 124
ccactttccg acatccacca t 21
<210> 125
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 125
gggttaagcc aatccaatgg g 21
<210> 126
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 126
gtttaaaaag aggcaatcgg g 21
<210> 127
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 127
tggcaagtgt aggagatgga a 21
<210> 128
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 128
catattttaa ataaaatgat a 21
<210> 129
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 129
aggggagcat ctcctggtgg g 21
<210> 130
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 130
tgaaatctgg tgaaatagat c 21
<210> 131
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 131
gggctagaaa tggaggggca a 21
<210> 132
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 132
ccctgtaatt acttggtcta a 21
<210> 133
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 133
gagatttgtt tagggctgga c 21
<210> 134
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 134
ggttgtcatc agagagtttg a 21
<210> 135
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 135
tactctctcc ctggattgac t 21
<210> 136
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 136
aaactgacac gtttcctcgc a 21
<210> 137
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 137
catggcaatt gatggtagaa a 21
<210> 138
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 138
gagcagtttc taaattggct g 21
<210> 139
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 139
taaccaagtt tctagctaga t 21
<210> 140
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 140
gtctcaccaa ggaaaatgca c 21
<210> 141
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 141
ctctatggat agagctctgt g 21
<210> 142
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 142
ttccactgaa attgctcttt c 21
<210> 143
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 143
agtttggcta agagcccaag a 21
<210> 144
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 144
accagctaag gttaccaaaa a 21
<210> 145
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 145
cccagcctct caacactgtt g 21
<210> 146
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 146
gaaatgacat aatcaaaacc a 21
<210> 147
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 147
atgtcacatg tatagtgcta a 21
<210> 148
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 148
aaaatgggaa agtggatcat a 21
<210> 149
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 149
accggtcgtc tcctgcaaac a 21
<210> 150
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 150
ccctcctggc ggcgttccca g 21
<210> 151
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 151
aggcatgagc cactgtgtcc a 21
<210> 152
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 152
aataagtcta aaatagaaag a 21
<210> 153
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 153
tttttggtgc caccctttag a 21
<210> 154
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 154
tggccccaac cttggtgaca c 21
<210> 155
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 155
aactaaggat gaggaacaca g 21
<210> 156
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 156
cacactggat ttcaaaggct t 21
<210> 157
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 157
cagcattcct agccttggcc a 21
<210> 158
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 158
gtttatttac aacttaatcc a 21
<210> 159
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 159
atgttccaat gacaaataat c 21
<210> 160
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 160
gttgttgttt taagaaataa c 21
<210> 161
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 161
aggagggggc ctgctccctg g 21
<210> 162
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 162
gggtccaggg cacaggcttt a 21
<210> 163
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 163
gccacccatt gaagacacct g 21
<210> 164
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 164
tccggggggg acacatctcc c 21
<210> 165
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 165
tgtgcagaag agagactatc t 21
<210> 166
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 166
gggaagccaa ctcccttggc c 21
<210> 167
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 167
tctcatctat aatataaaat a 21
<210> 168
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 168
agctattata tatcttaaat c 21
<210> 169
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 169
gtgttgcatt tatttgagtt a 21
<210> 170
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 170
attttggcat attaaatcat c 21
<210> 171
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 171
acaggggcca tgcaatatga t 21
<210> 172
<211> 21
<212> DNA
<213> Artificial Synthesis
<400> 172
tctttggaac ttggttttct a 21
<210> 173
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 173
gggtcaaggc aggatgtggg ggaccg 26
<210> 174
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 174
cccctgcttt caatagcact ttgtgg 26
<210> 175
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 175
ttagaaaatc ctaaagacta ccaaaa 26
<210> 176
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 176
aagaacctta aagtagttat agctgc 26
<210> 177
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 177
agttagtcat taagattgtg ggatct 26
<210> 178
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 178
cgaattccct ctgcctctga tttcac 26
<210> 179
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 179
tgcatagttt gcaaacattg agaagg 26
<210> 180
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 180
acctcaaagt cccataggcc tgggag 26
<210> 181
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 181
tcctgaagta gaggtattgt catggc 26
<210> 182
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 182
tggattgttt atttgcatcc cagcat 26
<210> 183
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 183
tggaagttcc cctcttatgg gaatag 26
<210> 184
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 184
atttaaattt tcccgcatgc caaaag 26
<210> 185
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 185
gttggggttt tattttctga gtgacg 26
<210> 186
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 186
gttttttaac ctctctctag agctga 26
<210> 187
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 187
gaaggaactt atctcaagtg tggtgg 26
<210> 188
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 188
ggtatagatc atcagcttaa ctcagt 26
<210> 189
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 189
aatttgcgat ggcacgttgg taagaa 26
<210> 190
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 190
ccttaaatgt tagtatctag actctg 26
<210> 191
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 191
gtcaagaact gcactaaaca tttaca 26
<210> 192
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 192
ccattgggcc ttctgcctta aacagt 26
<210> 193
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 193
gggtagagat ggaaagggat atgaac 26
<210> 194
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 194
aaaataatca ttattactgt ccttgt 26
<210> 195
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 195
caagtgggac cagccacata atttta 26
<210> 196
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 196
ctactgtgag cactgataaa attaat 26
<210> 197
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 197
ctgcctttag tggaagagaa tggctt 26
<210> 198
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 198
taaagtggat aaccaataaa atagac 26
<210> 199
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 199
tgtggccagg ctgctgttat gcaatg 26
<210> 200
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 200
catggaagtt tgacaagagt gtacgc 26
<210> 201
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 201
gacgttccat gcccaagatg gatgta 26
<210> 202
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 202
tccatttgct taaatcctag gcaact 26
<210> 203
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 203
agttgtttct ctgcatctga tcttcc 26
<210> 204
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 204
tcatcaaacc acttgatgtt caactg 26
<210> 205
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 205
cgatctgtac tctattgcag gaatgt 26
<210> 206
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 206
ggcagataag gtcatgggga ggagga 26
<210> 207
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 207
ctgatgtgtc catgcaattt gcctta 26
<210> 208
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 208
ctcagttgtg cctggagctc ggaccc 26
<210> 209
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 209
cggtggctca gagggggatg ggaacg 26
<210> 210
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 210
tttgttgtat tttattagta ttagcc 26
<210> 211
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 211
gagtgagccc aggcctctct gctcca 26
<210> 212
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 212
attatttatc ttaaatatat tgtggg 26
<210> 213
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 213
tggaattagg taatgtattt caaggg 26
<210> 214
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 214
tgactcgctt ttaaaaaatc agtgcc 26
<210> 215
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 215
tgaaagatgg gtgggtaggg ggagac 26
<210> 216
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 216
tggtaaattg ctctcctctc ccccag 26
<210> 217
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 217
actttcagag gagggcagag ctcacc 26
<210> 218
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 218
acaatactca ccaactgaca cagacc 26
<210> 219
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 219
agccctgaga agtcattact atctcc 26
<210> 220
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 220
gaactaggct ccaagttctc acgaca 26
<210> 221
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 221
gagggtgtca acaggtccgt atagtt 26
<210> 222
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 222
tttttcatgc catttgcaca ttacta 26
<210> 223
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 223
gataactcac ttgcaagttc gctcag 26
<210> 224
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 224
ctgtttaaaa caaacttgac aaagca 26
<210> 225
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 225
ttttatagta gagtgcgggg aaagat 26
<210> 226
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 226
gagctacatt gtctgctcat acaact 26
<210> 227
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 227
agacaatgaa agacatactt agcacc 26
<210> 228
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 228
agagcgtttt tgcgggagga atatgt 26
<210> 229
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 229
ttagttaagt tagccacaaa tacagg 26
<210> 230
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 230
ctaccgttca cgggccaggg gccgcc 26
<210> 231
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 231
tgaccaaaat gaaacatttc aattac 26
<210> 232
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 232
attccgttgt cctggcaacc tgtata 26
<210> 233
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 233
gctctaagaa gggatgtaga catggc 26
<210> 234
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 234
actggctgag gtgccgaaga cacacg 26
<210> 235
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 235
ggattgggta aatatagaac taaagg 26
<210> 236
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 236
gtttgattga ggattctttg cttgag 26
<210> 237
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 237
gttccaagac aggtggctag gcattc 26
<210> 238
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 238
aagtttgtgt cttctccctc acacca 26
<210> 239
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 239
tctcctgaaa ccaaacccag gatgcg 26
<210> 240
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 240
gaaaagttaa atgtgcataa taatca 26
<210> 241
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 241
tcacatcagg ctgagttaca agactg 26
<210> 242
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 242
gatgtacttg caagtagcag ggtcat 26
<210> 243
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 243
tttttgtgag agtagtctac acttgc 26
<210> 244
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 244
tttcactcat atagtctcac aggaaa 26
<210> 245
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 245
agttttggag caaaccatag cacatg 26
<210> 246
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 246
agaactgcta aaccagagag tgtggt 26
<210> 247
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 247
actcggcctc tccccacaga atacag 26
<210> 248
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 248
ccttatgcct tgaataaagc cttatt 26
<210> 249
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 249
gcagatgaaa aaaactgagg ttgagg 26
<210> 250
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 250
agaaaatggt tcgtactgtg gcttca 26
<210> 251
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 251
aagtatctat caaatcacat gcatcg 26
<210> 252
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 252
ttggggacta ccttcagagt cttcaa 26
<210> 253
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 253
ccaaagtcag catcacacgc ctcgct 26
<210> 254
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 254
ttccttcctt tctcccactg ctctcg 26
<210> 255
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 255
tgccaaggag gcgcgtgtct acagcg 26
<210> 256
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 256
tgttttaagg attaaatgat agcacg 26
<210> 257
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 257
caatggtagt tttatgtttc aattca 26
<210> 258
<211> 26
<212> DNA
<213> Artificial Synthesis
<400> 258
aaaaaggact ggacatttgt gctggc 26

Claims (10)

1. The high myopia gene detection kit is characterized in that SNP typing can be simultaneously carried out on 86 gene loci of high myopia related susceptibility genes, wherein the 86 gene loci are as follows: rs1339000, rs10779363, rs783623, rs882879, rs2352096, rs3108476, rs61548163, rs79468746, rs2741293, rs1728288, rs1452186, rs322700, rs117449253, rs147806089, rs61659428, rs73110528, rs11723242, rs13138132, rs36187983, rs139039638, rs2583612, rs9312984, rs256310, rs 8030, rs35590232, rs10458140, rs 858, rs3756772, rs6930157, rs 2969301347, rs 121281, rs 2706948415, rs182270494, rs77003675, rs 7387240, rs 2323124, rs 99242744, rs13230744, rs 90054, rs 787894, rs 656579797979797779777946, rs 4277797779777946, rs 4277567946, rs 42775679775639, rs 42777977569862, rs 427356989, rs 72797756982, rs 72982, rs 42775635989, rs 4277797756989, rs 427356982, rs 427356300, rs 427356989, rs 427356300, rs 7256300, rs 7279437356300, rs 427356300, rs 7256300, rs 7279437356300, rs 72798, rs 7279437356300, rs 7256300, rs 7279437356300, rs 7256300, rs 7279437356300, rs 72798, rs 7279437356300, rs 727945, rs 72798, rs 7279437356300, rs 727945, rs 7279437356300, rs 72798, rs 7279437356300, rs 427356300, rs 7279437356300, rs 727945, rs 7279437356300, rs 727945, rs 7279437356300, rs 437356300, rs 427356300, rs 7279437356300, rs 7279.
2. The high myopia gene detection kit according to claim 1, wherein the high myopia gene detection kit comprises 86 pairs of amplification primers for amplifying 86 gene fragments, and the sequences of the 86 pairs of amplification primers are as follows:
Figure FDA0003285972140000011
Figure FDA0003285972140000021
Figure FDA0003285972140000031
3. The high myopia gene detection kit according to claim 2, further comprising 86 extension primers for identifying the 86 gene mutation sites in turn, wherein the sequences of the extension primers are shown as SEQ ID NO. 173 to SEQ ID NO. 258.
4. A high myopia genetic risk assessment system, comprising:
an acquisition module for acquiring a DNA sample of a tester;
a detection module comprising the high myopia gene detection kit according to any one of claims 1 to 3, for genotyping the DNA sample;
a prediction module comprising a high myopia genetic risk prediction model for scoring the high myopia genetic risk of the tester; and
an evaluation module comprising a high myopia risk grading system for determining the high myopia genetic risk grade of the tester.
5. The high myopia genetic risk assessment system of claim 4, wherein the high myopia genetic risk prediction model integrates risk genomic locus information using genome-wide association analysis and machine learning algorithms to predict and quantitatively score the high myopia genetic risk.
6. The high myopia genetic risk assessment system according to claim 5, wherein the high myopia genetic risk prediction model uses a total of 87 features, comprising gender and the gene loci, together with an intercept term of -1.68524541 that has a weight of 1; the "risk allele dosage" is used as the value of each locus feature, and the weighted sum of all features is taken as the high myopia genetic risk score of the tester; the feature IDs and feature weights are as follows:
Figure FDA0003285972140000032
Figure FDA0003285972140000041
Figure FDA0003285972140000051
Figure FDA0003285972140000061
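The scoring rule of claim 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed model: the weights in `EXAMPLE_WEIGHTS` are made-up placeholders (the real values are in the table images of claim 6), gender is assumed to be coded 0/1, and the final logistic transform is an assumption suggested by the logistic regression of claim 9 and the 0 to 1 grade intervals of claim 7; claim 6 itself speaks only of a weighted sum.

```python
import math

INTERCEPT = -1.68524541  # intercept term stated in claim 6 (weight 1)

# Placeholder weights for illustration only; the real values are in the
# claim's table. The rs IDs are taken from the locus list of claim 1.
EXAMPLE_WEIGHTS = {
    "gender": 0.10,       # assumed coding: 0 or 1
    "rs1339000": 0.05,
    "rs10779363": -0.02,
}


def linear_score(features, weights=EXAMPLE_WEIGHTS, intercept=INTERCEPT):
    """Weighted sum of feature values plus the intercept.

    Locus features hold the risk allele dosage (0, 1 or 2 copies).
    """
    missing = set(weights) - set(features)
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    return intercept + sum(weights[name] * features[name] for name in weights)


def risk_probability(features, weights=EXAMPLE_WEIGHTS, intercept=INTERCEPT):
    """Assumed logistic transform mapping the linear score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_score(features, weights, intercept)))
```

A call such as `risk_probability({"gender": 1, "rs1339000": 2, "rs10779363": 1})` returns a value between 0 and 1 that could then be mapped to a risk grade.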
7. The high myopia genetic risk assessment system according to claim 4, wherein the high myopia risk grading system comprises six grades in total: grade 1: 0-0.1706408; grade 2: 0.1706408-0.2526106; grade 3: 0.2526106-0.3388352; grade 4: 0.3388352-0.4370253; grade 5: 0.4370253-0.5569740; grade 6: 0.5569740-1; the five intervals of grades 1 to 5 are closed on the left and open on the right, the interval of grade 6 is closed on both sides, and the high myopia genetic risk increases progressively from grade 1 to grade 6.
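The grade intervals of claim 7 can be expressed as a simple lookup. The sketch below is illustrative only; the function name `risk_grade` is hypothetical, and the boundary values are those stated in the claim.

```python
from bisect import bisect_right

# Upper bounds of grades 1 to 5 as given in claim 7; grade 6 runs up to
# and including 1.
GRADE_BOUNDS = [0.1706408, 0.2526106, 0.3388352, 0.4370253, 0.5569740]


def risk_grade(score):
    """Return the risk grade (1 to 6) for a score in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    # bisect_right places a score equal to a bound into the higher grade,
    # which matches the left-closed, right-open intervals of grades 1 to 5.
    return bisect_right(GRADE_BOUNDS, score) + 1
```

A score that falls exactly on a boundary, for example 0.1706408, is assigned to the higher grade (grade 2), and a score of exactly 1 falls in grade 6.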
8. A high myopia genetic risk assessment method, characterized by comprising the following steps:
S1: obtaining a DNA sample of a tester;
S2: genotyping the DNA sample using the high myopia gene detection kit according to any one of claims 1 to 3;
S3: constructing a high myopia genetic risk prediction model, and obtaining a high myopia genetic risk score of the tester by using the high myopia genetic risk prediction model;
S4: using a high myopia risk grading system to determine, depending on the testers concerned, an individual's risk grade, the overall risk distribution of a population, or the actual distribution of high myopia within a specific risk-grade group.
9. The high myopia genetic risk assessment method according to claim 8, wherein constructing the high myopia genetic risk prediction model in step S3 comprises the following steps:
S31: acquiring high myopia phenotype data of volunteers by means of a myopia questionnaire;
S32: acquiring gene data of the volunteers who participated in the myopia questionnaire;
S33: preprocessing the collected questionnaire and gene data, and dividing them into a training set, a validation set and a test set in the manner customary in machine learning;
S34: performing genome-wide association analysis on the training set to obtain GWAS statistical data for high myopia, the GWAS statistical data comprising the IDs of the loci associated with high myopia and their significance P values;
S35: filtering the loci according to the GWAS statistical data, using each pairwise combination of a locus linkage correlation r² value and a significance P value as the filtering criteria, to obtain a number of filtered locus sets;
S36: for each filtered locus set, performing feature selection adaptively with a machine-learning feature selection algorithm, applying a logistic regression algorithm to the training set with the selected features to establish a high myopia genetic risk prediction model, and carrying out prediction and performance evaluation on the independent validation set;
S37: based on the prediction performance on the validation set, selecting, according to the AUC value, the model that performs best on the validation set together with its optimal gene loci;
S38: applying the optimal model to the independent test set for prediction, to obtain the model's prediction performance, risk stratification capability and population risk distribution in the general population.
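The threshold combination of step S35 and the AUC-based selection of step S37 can be sketched as follows. Everything here is an assumption made for illustration: the threshold grids, the function names and the data layout are hypothetical, the claims do not state the values actually used, and the r² filtering itself is left out because it needs pairwise linkage data that this sketch does not model.

```python
from itertools import product

# Hypothetical threshold grids; the claims do not state the values used.
R2_THRESHOLDS = [0.2, 0.5, 0.8]
P_THRESHOLDS = [5e-8, 1e-5, 1e-3]


def candidate_settings():
    """Every pairwise (r2, P) combination; one filtered locus set per pair."""
    return list(product(R2_THRESHOLDS, P_THRESHOLDS))


def filter_by_p(gwas_rows, p_threshold):
    """Keep the loci whose GWAS P value is below the threshold.

    gwas_rows is an iterable of (locus_id, p_value) pairs.
    """
    return [locus for locus, p in gwas_rows if p < p_threshold]


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (Mann-Whitney U).

    Each positive/negative pair counts 1 if the positive sample scores
    higher and 0.5 if the two scores are tied.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both positive and negative labels")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_best(validation_results):
    """Return the setting with the highest validation AUC.

    validation_results maps a setting to (labels, scores) obtained on the
    validation set by the model trained for that setting.
    """
    return max(validation_results, key=lambda k: auc(*validation_results[k]))
```

In a full pipeline, each setting returned by `candidate_settings()` would yield one filtered locus set, one logistic regression model trained on it, and one entry in `validation_results`.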
10. The high myopia genetic risk assessment method according to claim 9, wherein step S32 comprises: genotyping on a high-throughput gene chip based on the Axiom Precision Medicine Research Array, using the GeneTitan Multi-Channel instrument platform, to acquire the gene data.
CN202111147516.5A 2021-09-29 2021-09-29 High myopia gene detection kit, and high myopia genetic risk assessment system and method Active CN113637742B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111147516.5A CN113637742B (en) 2021-09-29 2021-09-29 High myopia gene detection kit, and high myopia genetic risk assessment system and method

Publications (2)

Publication Number Publication Date
CN113637742A true CN113637742A (en) 2021-11-12
CN113637742B CN113637742B (en) 2023-12-01

Family

ID=78426309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111147516.5A Active CN113637742B (en) 2021-09-29 2021-09-29 High myopia gene detection kit, and high myopia genetic risk assessment system and method

Country Status (1)

Country Link
CN (1) CN113637742B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114395620A (en) * 2021-12-20 2022-04-26 温州谱希医学检验实验室有限公司 Biomarker combination for detecting high-myopia susceptible population
CN114891876A (en) * 2022-05-13 2022-08-12 上海谱希和光基因科技有限公司 Functional genome area biomarker combination for diagnosing high myopia
CN116287199A (en) * 2023-02-09 2023-06-23 山东中医药大学附属眼科医院(山东施尔明眼科医院) Primer combination and kit for detecting high myopia risk and application of primer combination and kit

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110129838A1 (en) * 2008-07-28 2011-06-02 Kaohsiung Medical University Using genetic polymorphisms of the bicd1 gene as a method for determining a risk of developing myopia
CN109750103A (en) * 2019-03-11 2019-05-14 深圳乐土生物科技有限公司 High myopia gene detecting kit, chip and high myopia detection method based on flight time mass spectrum
US20200251193A1 (en) * 2018-05-21 2020-08-06 Multimodal Imaging Services Corporation System and method for integrating genotypic information and phenotypic measurements for precision health assessments
CN111893179A (en) * 2020-09-01 2020-11-06 陕西九州医学检验有限公司 Molecular marker of myopia-related susceptibility gene, detection primer set and application thereof
CN112980949A (en) * 2020-12-17 2021-06-18 中山大学 SNP marker for identifying nasopharyngeal carcinoma high risk group, kit and application thereof
CN113403379A (en) * 2021-06-11 2021-09-17 中国科学院北京基因组研究所(国家生物信息中心) Ophthalmologic disease related SNP site primer composition and application
CN115029431A (en) * 2022-06-20 2022-09-09 无锡市疾病预防控制中心 Type 2 diabetes gene detection kit and type 2 diabetes genetic risk assessment system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ALFRED POZARICKIJ et al.: "Quantile regression analysis reveals widespread evidence for gene-environment or gene-gene interactions in myopia development", 《COMMUNICATIONS BIOLOGY》 *
OLIVIER MAUDUIT et al.: "RCBTB1 Deletion Is Associated with Metastatic Outcome and Contributes to Docetaxel Resistance in Nontranslocation-Related Pleomorphic Sarcomas", 《CANCERS》, vol. 11, no. 1 *
HAO WENWEN: "Association analysis of four SNP loci on chromosome 4 with high myopia", 《WANFANG DATA》 *

Also Published As

Publication number Publication date
CN113637742B (en) 2023-12-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant