CN106951730A - Method and device for determining the pathogenicity grade of a gene variant - Google Patents

Method and device for determining the pathogenicity grade of a gene variant

Info

Publication number: CN106951730A
Authority: CN (China)
Prior art keywords: single gene, gene inheritance, inheritance disease, data, grade
Legal status: Pending
Application number: CN201710170243.3A
Other languages: Chinese (zh)
Inventor: Not disclosed (inventor name withheld from publication)
Current Assignee: Shuo Medical Data Technology (Beijing) Co Ltd
Original Assignee: Shuo Medical Data Technology (Beijing) Co Ltd
Applicant: Shuo Medical Data Technology (Beijing) Co Ltd
Priority: CN201710170243.3A

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Abstract

The invention provides a method and device for determining the pathogenicity grade of a gene variant. The method includes: obtaining structured monogenic disease research data; establishing a first correspondence between the structured monogenic disease research data and the strength levels of that data; determining the strength level of the structured monogenic disease research data corresponding to a gene variant to be analyzed; and matching that strength level against a pre-established decision tree model of gene variant pathogenicity grades to determine the pathogenicity grade of the gene variant to be analyzed. Embodiments of the invention grade the pathogenicity of gene variants for monogenic diseases automatically, which saves considerable time and labor and yields accurate grading.

Description

Method and device for determining the pathogenicity grade of a gene variant
Technical field
The present invention relates to the technical fields of data analysis and bioinformatics, and in particular to a method and device for determining the pathogenicity grade of a gene variant.
Background
A monogenic disease is a genetic disease controlled by a single pair of alleles. More than 6,600 such diseases are known, and the number grows by 10 to 50 each year. Relatively common examples include red-green color blindness, hemophilia, and albinism. Monogenic diseases now pose a serious threat to human health, so testing for them is needed in some situations, and such testing relies on monogenic disease databases.
Existing monogenic disease databases include Online Mendelian Inheritance in Man (OMIM) and The Human Gene Mutation Database (HGMD), among others. These databases store the evidence from research on monogenic diseases, together with information such as the pathogenicity of gene variants for those diseases. To make the databases easier to use for testing, the pathogenicity of the stored gene variants for monogenic diseases is usually graded.
In the prior art, this grading is mostly done manually, following the grading standard recommended by the American College of Medical Genetics and Genomics (ACMG). The workload is heavy, it consumes a great deal of labor and time, and manual grading has relatively low accuracy.
Summary of the invention
In view of this, the purpose of the embodiments of the present invention is to provide a method and device for determining the pathogenicity grade of a gene variant, so as to solve the problem in the prior art that determining the grade manually consumes a great deal of labor and resources and has low accuracy.
In a first aspect, an embodiment of the present invention provides a method for determining the pathogenicity grade of a gene variant, including:
obtaining structured monogenic disease research data;
establishing a first correspondence between the structured monogenic disease research data and the strength levels of the structured monogenic disease research data;
determining the strength level of the structured monogenic disease research data corresponding to a gene variant to be analyzed;
determining the pathogenicity grade of the gene variant to be analyzed according to the strength level and a pre-established decision tree model of gene variant pathogenicity grades.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation of the first aspect, in which obtaining the structured monogenic disease research data includes:
collecting unstructured monogenic disease research data;
converting the unstructured monogenic disease research data into the structured monogenic disease research data.
With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a second possible implementation of the first aspect, in which converting the unstructured monogenic disease research data into the structured monogenic disease research data includes:
extracting keywords from the unstructured monogenic disease research data;
establishing a second correspondence between the keywords and the unstructured monogenic disease research data to obtain the structured monogenic disease research data.
With reference to the first aspect, an embodiment of the present invention provides a third possible implementation of the first aspect, in which establishing the first correspondence between the structured monogenic disease research data and the strength levels of that data includes:
obtaining the keywords in the structured monogenic disease research data;
matching the keywords against a pre-established strength level classification standard;
determining, according to the matching result, the strength level corresponding to the structured monogenic disease research data;
establishing the first correspondence according to the strength level corresponding to the structured monogenic disease research data.
With reference to the first aspect, an embodiment of the present invention provides a fourth possible implementation of the first aspect, in which determining the strength level of the structured monogenic disease research data corresponding to the gene variant to be analyzed includes:
obtaining data on the gene variant to be analyzed;
determining the structured monogenic disease research data corresponding to the data on the gene variant to be analyzed;
determining, according to the first correspondence, the strength level corresponding to that structured monogenic disease research data.
With reference to the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, in which, before the pathogenicity grade of the gene variant to be analyzed is determined according to the strength level and the pre-established decision tree model of gene variant pathogenicity grades, the method further includes:
establishing the decision tree model of gene variant pathogenicity grades.
In a second aspect, an embodiment of the present invention provides a device for determining the pathogenicity grade of a gene variant, the device including:
an acquisition module, configured to obtain structured monogenic disease research data;
an establishing module, configured to establish a first correspondence between the structured monogenic disease research data and the strength levels of the structured monogenic disease research data;
a first determining module, configured to determine the strength level of the structured monogenic disease research data corresponding to a gene variant to be analyzed;
a second determining module, configured to determine the pathogenicity grade of the gene variant to be analyzed according to the strength level and a pre-established decision tree model of gene variant pathogenicity grades.
With reference to the second aspect, an embodiment of the present invention provides a first possible implementation of the second aspect, in which the acquisition module includes:
a collecting unit, configured to collect unstructured monogenic disease research data;
a converting unit, configured to convert the unstructured monogenic disease research data into the structured monogenic disease research data.
With reference to the second aspect, an embodiment of the present invention provides a second possible implementation of the second aspect, in which the establishing module includes:
a first acquisition unit, configured to obtain the keywords in the structured monogenic disease research data;
a matching unit, configured to match the keywords against a pre-established strength level classification standard;
a first determining unit, configured to determine, according to the matching result, the strength level corresponding to the structured monogenic disease research data;
an establishing unit, configured to establish the first correspondence according to the strength level corresponding to the structured monogenic disease research data.
With reference to the second aspect, an embodiment of the present invention provides a third possible implementation of the second aspect, in which the first determining module includes:
a second acquisition unit, configured to obtain data on the gene variant to be analyzed;
a second determining unit, configured to determine the structured monogenic disease research data corresponding to the data on the gene variant to be analyzed;
a third determining unit, configured to determine, according to the first correspondence, the strength level corresponding to that structured monogenic disease research data.
With the method and device provided by the embodiments of the present invention, the pathogenicity grade of a gene variant for a monogenic disease is determined automatically, which saves considerable time and labor and yields accurate grading.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope. A person of ordinary skill in the art can derive other related drawings from these drawings without creative effort.
Fig. 1 shows a flowchart of the method for determining the pathogenicity grade of a gene variant provided by an embodiment of the present invention;
Fig. 2 shows a flowchart of determining the correspondence between monogenic disease research data and strength levels in the method provided by an embodiment of the present invention;
Fig. 3 shows a schematic diagram of the decision tree model for the pathogenic and likely pathogenic grades in the method provided by an embodiment of the present invention;
Fig. 4 shows a schematic diagram of the decision tree model for the benign and likely benign grades in the method provided by an embodiment of the present invention;
Fig. 5 shows a structural diagram of the device for determining the pathogenicity grade of a gene variant provided by an embodiment of the present invention;
Fig. 6 shows a second structural diagram of the device for determining the pathogenicity grade of a gene variant provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the purposes, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. The components of the embodiments generally described and illustrated in the drawings can be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
In the prior art, the pathogenicity of gene variants for monogenic diseases is mostly graded manually according to the grading standard recommended by ACMG. The workload is heavy, it consumes a great deal of labor and time, and the grades determined manually have low accuracy. On this basis, the embodiments of the present invention provide a method and device for determining the pathogenicity grade of a gene variant, described below by way of embodiments.
As shown in Fig. 1, an embodiment of the present invention provides a method for determining the pathogenicity grade of a gene variant, including steps S110-S140, as follows.
S110: obtain structured monogenic disease research data.
Obtaining the structured monogenic disease research data includes: collecting unstructured monogenic disease research data, and converting the unstructured monogenic disease research data into structured monogenic disease research data.
Specifically, the unstructured monogenic disease research data is collected mainly from the following sources: published research literature related to variant information, prediction results from bioinformatics software, variant frequencies in human gene variant databases, variant classifications from authoritative databases, and gene sequencing data from family studies.
The collected unstructured monogenic disease research data may take the form of images, text, and other unstructured formats. After collection, it is converted into structured monogenic disease research data. Structured data can be processed by a computer, so once the structured monogenic disease research data is stored in a monogenic disease database, it can be retrieved, read, and matched automatically.
Converting the unstructured monogenic disease research data into structured monogenic disease research data includes the following process: extracting keywords from the unstructured monogenic disease research data, and establishing a second correspondence between the keywords and the unstructured monogenic disease research data to obtain the structured monogenic disease research data.
In the embodiments of the present invention, the monogenic disease research data is first classified according to the ACMG standard to determine the category to which it belongs. Specifically, there are eight categories of monogenic disease research data: population data, predictive data, functional data, segregation data, de novo variant data, allelic data, other database data, and other data.
In the embodiments of the present invention, population data refers to studies or records of the frequency of a gene variant in a population. Predictive data refers to the results of bioinformatics software that predicts the effect of a gene variant. Functional data refers to in vivo or in vitro studies of the molecular function of a gene variant. Segregation data refers to studies of whether the disease and the gene variant co-segregate in families affected by a monogenic disease. De novo variant data refers to studies in which a patient appears in a family with no history of the monogenic disease, the patient carries a de novo variant, and the patient's parents do not carry it. Allelic data refers to studies that, for a monogenic disease, find a pathogenic variant in cis or in trans with the variant site. Other database data refers to the pathogenicity classifications of gene variants given by authoritative studies or databases (for example, HGMD and ClinVar). Other data refers to studies of other kinds.
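As an illustration, the eight categories can be modeled as an enumeration so that every structured record carries exactly one category. This is a minimal sketch only; the names `EvidenceCategory` and `ResearchRecord`, and the fields of the record, are assumptions made for the example and do not come from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class EvidenceCategory(Enum):
    """The eight categories of research data (identifier names are hypothetical)."""
    POPULATION = "population data"
    PREDICTIVE = "predictive data"
    FUNCTIONAL = "functional data"
    SEGREGATION = "segregation data"
    DE_NOVO = "de novo variant data"
    ALLELIC = "allelic data"
    OTHER_DATABASE = "other database data"
    OTHER = "other data"


@dataclass(frozen=True)
class ResearchRecord:
    """One piece of research data tagged with its category (hypothetical layout)."""
    source: str                  # e.g. a database name or literature identifier
    category: EvidenceCategory
    summary: str


# Example: a classification taken from an authoritative database.
record = ResearchRecord(
    source="ClinVar",
    category=EvidenceCategory.OTHER_DATABASE,
    summary="Pathogenicity classification reported by the database.",
)
```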
In the embodiments of the present invention, a semantic-based keyword extraction (SKE) algorithm can be used to extract the keywords from the unstructured monogenic disease research data.
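The text does not describe the SKE algorithm itself, so the sketch below substitutes a much simpler technique, matching against a fixed phrase list, purely to show what the second correspondence (keywords linked back to their unstructured source) might look like. The vocabulary, `StructuredRecord`, `extract_keywords`, and `to_structured` are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical phrase list. A real system would obtain keywords from a
# semantic extraction algorithm rather than from a fixed vocabulary.
VOCABULARY = ("nonsense", "frameshift", "de novo", "missense", "synonymous")


@dataclass
class StructuredRecord:
    """Keywords linked to the unstructured text they were extracted from."""
    raw_text: str
    keywords: list


def extract_keywords(text, vocabulary=VOCABULARY):
    """Return the vocabulary phrases that occur in the text, in vocabulary order."""
    lowered = text.lower()
    return [kw for kw in vocabulary
            if re.search(r"\b" + re.escape(kw) + r"\b", lowered)]


def to_structured(text):
    """Build the keyword-to-source correspondence for one document."""
    return StructuredRecord(raw_text=text, keywords=extract_keywords(text))


rec = to_structured("A de novo frameshift variant was found in the proband.")
print(rec.keywords)
```

Keeping `raw_text` next to the keywords means that a later matching step can always be traced back to the original evidence.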
Specifically, the monogenic disease research data includes gene variant information, the monogenic disease corresponding to each gene variant, and the incidence of that monogenic disease.
S120: establish a first correspondence between the structured monogenic disease research data and the strength levels of the structured monogenic disease research data.
Different monogenic disease research data corresponds to different strength levels. The strength level indicates how much weight a piece of research data carries in determining the pathogenicity of a gene variant.
As shown in Fig. 2, establishing the correspondence between the structured monogenic disease research data and its strength levels includes steps S210-S240, as follows:
S210: obtain the keywords in the structured monogenic disease research data;
S220: match the keywords against the pre-established strength level classification standard;
S230: determine, according to the matching result, the strength level corresponding to the structured monogenic disease research data;
S240: establish the first correspondence according to the strength level corresponding to the structured monogenic disease research data.
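Steps S210-S240 can be sketched as a lookup from keywords to strength levels followed by construction of a record-to-level mapping. The classification standard below is a heavily simplified, hypothetical stand-in covering only a few pathogenic keywords; the patent derives the real standard from the ACMG recommendations. The record identifiers are made up for illustration.

```python
from enum import Enum


class Strength(Enum):
    VERY_STRONG = "very strong"
    STRONG = "strong"
    MODERATE = "moderate"
    SUPPORTING = "supporting"


# Hypothetical classification standard: keyword -> strength level.
CRITERIA = {
    "nonsense": Strength.VERY_STRONG,
    "frameshift": Strength.VERY_STRONG,
    "de novo": Strength.STRONG,
    "mutational hotspot": Strength.MODERATE,
    "cosegregation": Strength.SUPPORTING,
}


def strength_for(keywords, criteria=CRITERIA):
    """S220-S230: return the level of the first matching keyword, or None."""
    for kw in keywords:
        if kw in criteria:
            return criteria[kw]
    return None


def build_first_correspondence(records):
    """S240: map each record id to its strength level; unmatched records are skipped."""
    mapping = {}
    for record_id, keywords in records.items():  # S210: keywords of each record
        level = strength_for(keywords)
        if level is not None:
            mapping[record_id] = level
    return mapping


records = {"rec-1": ["frameshift"], "rec-2": ["de novo"], "rec-3": ["unrelated"]}
first_correspondence = build_first_correspondence(records)
```

Records whose keywords match nothing are left out of the mapping here; whether such records should instead be flagged for manual review is a design choice the text does not address.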
Specifically, the monogenic disease research data can be divided into research data supporting pathogenicity and research data supporting a benign variant. Pathogenic research data is divided into the following four levels: pathogenic very strong (PVS), pathogenic strong (PS), pathogenic moderate (PM), and pathogenic supporting (PP). Benign research data is divided into the following three levels: benign stand-alone (BA), benign strong (BS), and benign supporting (BP).
PVS refers to a null variant in a gene for which loss of function is the pathogenic mechanism of the related monogenic disease. Null variants include nonsense variants, frameshift variants, variants at the +1/+2 or -1/-2 base positions of a splice site, initiation codon variants, and deletions of one or more exons. These variants disrupt transcription and translation of the gene so that no normally functioning gene product is made, hence the name null variant.
PS includes PS1, PS2, PS3, and PS4. PS1: the nucleotide change causes the same amino acid change as a previously established pathogenic variant. PS2: a de novo variant is detected in a patient with no family history. PS3: in vivo or in vitro functional studies confirm a damaging effect of the variant on the gene or gene product. PS4: the prevalence of the variant in affected individuals is significantly higher than in controls.
PM includes PM1, PM2, PM3, PM4, PM5, and PM6. PM1: the variant lies in a mutational hotspot, a critical region, or a verified functional domain, and the region contains no benign variation. PM2: the variant is absent from controls, or present at extremely low frequency, in databases such as the Exome Sequencing Project (ESP), the 1000 Genomes Project, and the Exome Aggregation Consortium (ExAC). PM3: for a recessive disease, a pathogenic variant is found in trans. PM4: the variant changes protein length, specifically an in-frame deletion or insertion in a non-repeat region, or a stop-loss variant. PM5: a novel missense variant occurs at an amino acid residue where a different missense variant has already been judged pathogenic. PM6: the variant is assumed to be de novo, but the parents have not been tested to confirm this.
PP includes PP1, PP2, PP3, PP4, and PP5. PP1: the variant co-segregates with the disease in affected family members, and the gene containing the variant is known to cause the disease. PP2: a missense variant in a gene in which benign missense variation is rare and missense variants are a common mechanism of disease. PP3: multiple lines of computational evidence show that the variant has a deleterious effect on the gene or gene product. PP4: the patient's phenotype or family history is highly specific for a disease with a single genetic cause. PP5: reputable studies and databases support the variant as pathogenic, but the evidence is not available for independent evaluation by the laboratory.
BA means that the allele frequency is greater than 5% in databases such as the Exome Sequencing Project, the 1000 Genomes Project, or the Exome Aggregation Consortium.
BS includes BS1, BS2, BS3, and BS4. BS1: the allele frequency is greater than the expected incidence of the disease. BS2: for a recessive, dominant, or X-linked disease with full penetrance expected at an early age, the homozygous, heterozygous, or hemizygous variant, respectively, is still detected in healthy adults. BS3: in vivo or in vitro functional studies show no damaging effect on protein function or splicing. BS4: the variant does not segregate with the disease in affected members of a family.
BP includes BP1, BP2, BP3, BP4, BP5, BP6, and BP7. BP1: a missense variant in a gene in which disease is known to be caused primarily by truncating variants. BP2: for a fully penetrant dominant gene or disease, a pathogenic variant is found in trans. BP3: an in-frame deletion or insertion in a repeat region of unknown function. BP4: multiple lines of computational evidence show no effect on the gene or gene product. BP5: the variant is found in a case in which the disease has an alternative molecular basis. BP6: reputable studies and databases rate the variant as benign, but the evidence is not available for independent evaluation by the laboratory. BP7: a synonymous (silent) variant for which splicing prediction algorithms predict no effect on splicing and no new splice site, and the nucleotide is not highly conserved.
The strength level classification standard contains the description of the monogenic disease research data at each of the above levels.
Among the pathogenic strength levels, very strong corresponds to PVS, strong to PS1-PS4, moderate to PM1-PM6, and supporting to PP1-PP5. Among the benign strength levels, stand-alone corresponds to BA, strong to BS1-BS4, and supporting to BP1-BP7.
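The mapping from criterion codes to strength levels described above can be expressed as a small lookup keyed on the code prefix. This is an illustrative sketch; the function name `level_of` and the tuple representation are assumptions.

```python
import re

# Prefix -> (direction, strength level), following the mapping in the text.
PREFIX_TO_LEVEL = {
    "PVS": ("pathogenic", "very strong"),
    "PS": ("pathogenic", "strong"),
    "PM": ("pathogenic", "moderate"),
    "PP": ("pathogenic", "supporting"),
    "BA": ("benign", "stand-alone"),
    "BS": ("benign", "strong"),
    "BP": ("benign", "supporting"),
}


def level_of(code):
    """Split a criterion code such as 'PM2' into its direction and strength level."""
    match = re.fullmatch(r"(PVS|PS|PM|PP|BA|BS|BP)(\d*)", code)
    if match is None:
        raise ValueError(f"unknown criterion code: {code!r}")
    return PREFIX_TO_LEVEL[match.group(1)]


print(level_of("PM2"))
```

Codes without a numeric suffix, such as `PVS` and `BA` as written in the text, are accepted as well.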
Therefore, when the keywords are matched against the pre-established strength level classification standard, the keywords are first used to find the criterion that the structured monogenic disease research data meets, and the strength level of that criterion then gives the strength level of the structured monogenic disease research data.
The classification standard is pre-established; that is, it has already been set up before the method provided by the embodiments of the present invention is used to determine the pathogenicity of a gene variant. Specifically, the correspondence between strength levels and keywords can be established according to the standard recommended by ACMG.
S130: determine the strength level of the structured monogenic disease research data corresponding to the gene variant to be analyzed.
In the embodiments of the present invention, the gene variant that needs to be analyzed is called the gene variant to be analyzed. The structured monogenic disease research data corresponding to that variant is determined first, and the strength level of that data is then determined from the correspondence between structured monogenic disease research data and strength levels. This specifically includes:
obtaining data on the gene variant to be analyzed; determining the structured monogenic disease research data corresponding to the gene variant to be analyzed; and determining, according to the first correspondence, the strength level corresponding to that structured monogenic disease research data.
In the embodiments of the present invention, the data on the gene variant to be analyzed includes the chromosome on which the variant lies, the physical start and end positions of the variant on the chromosome, the base sequence before the variation, the base sequence after the variation, and so on.
The data on the gene variant to be analyzed is matched against each piece of structured monogenic disease research data in the first correspondence to find the structured monogenic disease research data corresponding to the variant. The strength level of that data is then determined from the first correspondence between the structured monogenic disease research data and the strength levels.
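A minimal sketch of this lookup, assuming the variant is identified by the fields listed above and that the research records are indexed by variant. The index, the record identifiers, and the coordinates are made-up illustrative values, not data from the patent.

```python
from typing import NamedTuple


class VariantKey(NamedTuple):
    """Fields the text lists for a gene variant to be analyzed."""
    chromosome: str
    start: int   # physical start position on the chromosome
    end: int     # physical end position on the chromosome
    ref: str     # base sequence before the variation
    alt: str     # base sequence after the variation


# Hypothetical index: variant -> ids of its structured research records.
RECORDS_BY_VARIANT = {
    VariantKey("7", 1000, 1000, "G", "A"): ["rec-1", "rec-2"],
}

# Hypothetical first correspondence: record id -> strength level.
FIRST_CORRESPONDENCE = {"rec-1": "strong", "rec-2": "moderate"}


def strength_levels(variant):
    """Return the strength levels of all research records matched to the variant."""
    record_ids = RECORDS_BY_VARIANT.get(variant, [])
    return [FIRST_CORRESPONDENCE[r] for r in record_ids
            if r in FIRST_CORRESPONDENCE]


levels = strength_levels(VariantKey("7", 1000, 1000, "G", "A"))
print(levels)
```

A variant with no matching records yields an empty list, which a downstream grading step could treat as having no evidence.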
S140, causes a disease level decisions tree-model according to above-mentioned strength grade and the genetic mutation that pre-establishes, it is determined that treating point Analyse the pathogenic grade of genetic mutation.
Above-mentioned pathogenic grade is including causing a disease, may cause a disease, benign, possible benign and uncertain meaning.
Above-mentioned cause a disease refers to that some genetic mutation can cause corresponding single gene inheritance disease;It is above-mentioned benign to refer to some Variation will not cause corresponding single gene inheritance disease;Above-mentioned may cause a disease refers to that some variation causes corresponding monogenic inheritance The possibility of disease is more than 90%;It is above-mentioned benign may refer to that some variation will not cause corresponding single gene inheritance disease can Energy property is more than 90%;Above-mentioned uncertain meaning refers to some genetic mutation and corresponding single gene inheritance disease onset relation not It is determined that.
The genetic variation pathogenicity grade decision tree model is pre-established; that is, the model must be set up before the method provided by the embodiment of the present invention is used to determine the pathogenicity grade of the genetic variation to be analyzed. Specifically, the model is built according to the genetic variation pathogenicity grading standard recommended by ACMG, and the process includes:
Determining the split nodes of the genetic variation pathogenicity grade decision tree model according to the grading standard recommended by ACMG, and building the decision tree model from the determined split nodes.
In embodiments of the present invention, the strength grades of the single-gene inherited disease research data are the non-leaf nodes of the decision tree model, the number of research data items at each strength grade forms the branches of the decision tree model, and the pathogenicity grade results are the leaf nodes. The non-leaf nodes are determined in order of strength grade. For the pathogenic and likely pathogenic decision tree model, as shown in Fig. 3, the strength grade very strong is the root node of the decision tree model, strong is the child node of very strong, moderate is the child node of strong, and supporting is the child node of moderate. The resulting pathogenic or likely pathogenic outcome is a leaf node of the decision tree, and the number on each branch is the number of single-gene inherited disease research data items at the corresponding strength grade.
For the benign and likely benign decision tree model, as shown in Fig. 4, the strength grade stand-alone is the root node of the decision tree model, strong is the child node of stand-alone, and supporting is the child node of strong; the resulting benign or likely benign outcome is a leaf node of the decision tree model.
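One way to picture such a tree is as nested data in which each internal node names a strength grade, each branch carries a count range, and each leaf is a result. The sketch below does this for the benign side only. Fig. 4 is not reproduced here, so the exact branch layout is an assumption derived from the benign and likely benign criteria stated in this description; the labels `BA`, `BS`, `BP` and the function name `walk` are illustrative.

```python
# Internal node: {"grade": ..., "branches": [(low, high, child), ...]}
# A branch is taken when low <= count <= high (high=None means no upper bound).
# A child is another node, a result string, or None (no benign-side result).
BENIGN_TREE = {
    "grade": "BA",  # stand-alone evidence is the root
    "branches": [
        (1, None, "benign"),
        (0, 0, {
            "grade": "BS",  # strong evidence is the child of stand-alone
            "branches": [
                (2, None, "benign"),
                (1, 1, {
                    "grade": "BP",  # supporting evidence is the child of strong
                    "branches": [(1, None, "likely benign"), (0, 0, None)],
                }),
                (0, 0, {
                    "grade": "BP",
                    "branches": [(2, None, "likely benign"), (0, 1, None)],
                }),
            ],
        }),
    ],
}


def walk(node, counts):
    """Follow branches by evidence count until a leaf (string or None) is reached."""
    while isinstance(node, dict):
        n = counts.get(node["grade"], 0)
        for low, high, child in node["branches"]:
            if n >= low and (high is None or n <= high):
                node = child
                break
        else:
            return None  # no branch matched (for example, a negative count)
    return node
```

For example, `walk(BENIGN_TREE, {"BS": 1, "BP": 1})` follows the zero branch at the root, the one branch at strong, and ends at the likely benign leaf.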
Specifically, the genetic variation pathogenicity grading standard recommended by ACMG is as follows, where each count is a number of items of single-gene inherited disease research data at the stated grade:
The criteria for pathogenic are:
1 PVS and at least 1 PS1-PS4; or
1 PVS and at least 2 PM1-PM6; or
1 PVS, 1 PM1-PM6 and 1 PP1-PP5; or
1 PVS and at least 2 PP1-PP5; or
at least 2 PS1-PS4; or
1 PS1-PS4 and at least 3 PM1-PM6; or
1 PS1-PS4, 2 PM1-PM6 and at least 2 PP1-PP5; or
1 PS1-PS4, 1 PM1-PM6 and at least 4 PP1-PP5.
The criteria for likely pathogenic are:
1 PVS and 1 PM1-PM6; or
1 PS1-PS4 and 1 or 2 PM1-PM6; or
1 PS1-PS4 and at least 2 PP1-PP5; or
at least 3 PM1-PM6; or
2 PM1-PM6 and 2 PP1-PP5; or
1 PM1-PM6 and at least 4 PP1-PP5.
The criteria for benign are: 1 BA; or at least 2 BS1-BS4.
The criteria for likely benign are: 1 BS1-BS4 and 1 BP1-BP7; or at least 2 BP1-BP7.
The criteria for uncertain significance are: none of the above criteria are met; or the single-gene inherited disease research data supporting benign and the single-gene inherited disease research data supporting pathogenic contradict each other.
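The combination rules above can be sketched as a small rule-based function. This is an illustration, not the patented implementation: the grade labels are collapsed to `PVS`, `PS`, `PM`, `PP`, `BA`, `BS`, `BP`, the function names are hypothetical, and every count is treated as a minimum (an assumption where the text gives an exact number such as "1" or "2").

```python
from collections import Counter


def pathogenic_side(pvs, ps, pm, pp):
    """Return 'pathogenic', 'likely pathogenic' or None from evidence counts."""
    if pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2):
        return "pathogenic"
    if ps >= 2:
        return "pathogenic"
    if ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)):
        return "pathogenic"
    if pvs >= 1 and pm >= 1:
        return "likely pathogenic"
    if ps >= 1 and (pm >= 1 or pp >= 2):
        return "likely pathogenic"
    if pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4):
        return "likely pathogenic"
    return None


def benign_side(ba, bs, bp):
    """Return 'benign', 'likely benign' or None from evidence counts."""
    if ba >= 1 or bs >= 2:
        return "benign"
    if (bs >= 1 and bp >= 1) or bp >= 2:
        return "likely benign"
    return None


def classify(grades):
    """Combine both sides; conflicting or insufficient evidence is uncertain."""
    c = Counter(grades)  # missing grades count as 0
    p = pathogenic_side(c["PVS"], c["PS"], c["PM"], c["PP"])
    b = benign_side(c["BA"], c["BS"], c["BP"])
    if p and b:
        return "uncertain significance"
    return p or b or "uncertain significance"
```

Checking the pathogenic rules before the likely pathogenic ones means the stronger result wins whenever both would apply, which mirrors the order in which the criteria are listed.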
The method for determining the pathogenicity grade of a genetic variation provided by the embodiment of the present invention automatically determines the pathogenicity grade of a genetic variation with respect to single-gene inherited diseases, saves a great deal of time and labor, and grades accurately.
Referring to Fig. 5, an embodiment of the present invention further provides a device for determining the pathogenicity grade of a genetic variation. The device is used to perform the method for determining the pathogenicity grade of a genetic variation provided by the embodiment of the present invention, and includes an acquisition module 310, an establishing module 320, a first determining module 330 and a second determining module 340;
The acquisition module 310 is configured to obtain structured single-gene inherited disease research data;
The establishing module 320 is configured to establish a first correspondence between the structured single-gene inherited disease research data and the strength grades of the structured single-gene inherited disease research data;
The first determining module 330 is configured to determine the strength grade of the structured single-gene inherited disease research data corresponding to the genetic variation to be analyzed;
The second determining module 340 is configured to determine the pathogenicity grade of the genetic variation to be analyzed according to the strength grade and the pre-established genetic variation pathogenicity grade decision tree model.
Specifically, the acquisition module 310 in the embodiment of the present invention obtains the structured single-gene inherited disease research data by means of a collecting unit and a converting unit:
The collecting unit is configured to collect unstructured single-gene inherited disease research data; the converting unit is configured to convert the unstructured single-gene inherited disease research data into structured single-gene inherited disease research data.
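The conversion performed by the converting unit might look like the sketch below, which follows the keyword-extraction approach named in claim 3. The keyword vocabulary, the record layout and the sample sentence are all hypothetical; the patent does not list the actual keywords or the storage format.

```python
import re

# Hypothetical keyword vocabulary, for illustration only.
KEYWORDS = ["de novo", "nonsense", "frameshift", "segregation", "functional study"]


def to_structured(record_id, text):
    """Extract known keywords from free text and keep a link to the source text."""
    lowered = text.lower()
    found = [kw for kw in KEYWORDS
             if re.search(r"\b" + re.escape(kw) + r"\b", lowered)]
    # The keyword list together with the original text stands in for the
    # "second correspondence" between keywords and unstructured data.
    return {"id": record_id, "keywords": found, "source_text": text}


structured = to_structured(
    "R1", "A de novo frameshift variant was found in the proband.")
```

Keeping the source text alongside the extracted keywords lets a reviewer trace each structured record back to the literature passage it came from.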
Referring to Fig. 6, the establishing module 320 establishes the correspondence between the structured single-gene inherited disease research data and the strength grades by means of a first acquisition unit 321, a matching unit 322, a first determining unit 323 and an establishing unit 324:
The first acquisition unit 321 is configured to obtain the keywords in the structured single-gene inherited disease research data; the matching unit 322 is configured to match the keywords against the pre-established strength grade classification standard; the first determining unit 323 is configured to determine, according to the matching result, the strength grade corresponding to the structured single-gene inherited disease research data; the establishing unit 324 is configured to establish the first correspondence according to the strength grade corresponding to the structured single-gene inherited disease research data.
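The four units above can be sketched as a keyword lookup followed by construction of the mapping. The keyword-to-grade table is a simplified, hypothetical stand-in for the pre-established strength grade classification standard, and taking the strongest matching grade per record is an assumption made for illustration.

```python
# Strongest grade first; used to pick one grade when several keywords match.
GRADE_ORDER = ["PVS", "PS", "PM", "PP"]

# Hypothetical, simplified classification standard: keyword -> strength grade.
CRITERIA = {
    "nonsense": "PVS",
    "frameshift": "PVS",
    "de novo": "PS",
    "functional study": "PS",
    "segregation": "PP",
}


def grade_record(keywords):
    """Match keywords against the standard and return the strongest grade, if any."""
    matched = {CRITERIA[k] for k in keywords if k in CRITERIA}
    for grade in GRADE_ORDER:
        if grade in matched:
            return grade
    return None


def build_first_correspondence(records):
    """Map each structured record id to its strength grade (ungraded records are skipped)."""
    result = {}
    for rec in records:
        grade = grade_record(rec["keywords"])
        if grade is not None:
            result[rec["id"]] = grade
    return result


first_correspondence = build_first_correspondence([
    {"id": "R1", "keywords": ["de novo", "frameshift"]},
    {"id": "R2", "keywords": ["segregation"]},
    {"id": "R3", "keywords": []},
])
```

Here `R3` has no matching keyword and is left out of the correspondence rather than being given a default grade.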
The first determining module 330 determines the strength grade of the structured single-gene inherited disease research data corresponding to the genetic variation to be analyzed by means of a second acquisition unit, a second determining unit and a third determining unit:
The second acquisition unit is configured to obtain the genetic variation data to be analyzed; the second determining unit is configured to determine the structured single-gene inherited disease research data corresponding to the genetic variation data to be analyzed; the third determining unit is configured to determine, according to the first correspondence, the strength grade corresponding to the structured single-gene inherited disease research data.
The device for determining the pathogenicity grade of a genetic variation provided by the embodiment of the present invention automatically determines the pathogenicity grade of a genetic variation with respect to single-gene inherited diseases, saves a great deal of time and labor, and grades accurately.
The device for determining the pathogenicity grade of a genetic variation provided by the embodiment of the present invention may be specific hardware on equipment, or software or firmware installed on equipment. The implementation principle and technical effects of the device are the same as those of the foregoing method embodiment; for brevity, where the device embodiment does not mention something, refer to the corresponding content in the foregoing method embodiment. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, device and units described above may refer to the corresponding processes in the foregoing method embodiment and are not repeated here.
In the embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiment described above is merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interfaces, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments provided by the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
It should be noted that similar reference numerals and letters denote similar items in the accompanying drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. In addition, the terms "first", "second", "third" and so on are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
Finally, it should be noted that the embodiments described above are only specific embodiments of the present invention, intended to illustrate its technical solution rather than to limit it, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone skilled in the art may, within the technical scope disclosed by the present invention, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features. Such modifications, changes or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for determining the pathogenicity grade of a genetic variation, characterized in that the method comprises:
obtaining structured single-gene inherited disease research data;
establishing a first correspondence between the structured single-gene inherited disease research data and the strength grades of the structured single-gene inherited disease research data;
determining the strength grade of the structured single-gene inherited disease research data corresponding to a genetic variation to be analyzed;
determining the pathogenicity grade of the genetic variation to be analyzed according to the strength grade and a pre-established genetic variation pathogenicity grade decision tree model.
2. The method according to claim 1, characterized in that obtaining the structured single-gene inherited disease research data comprises:
collecting unstructured single-gene inherited disease research data;
converting the unstructured single-gene inherited disease research data into the structured single-gene inherited disease research data.
3. The method according to claim 2, characterized in that converting the unstructured single-gene inherited disease research data into the structured single-gene inherited disease research data comprises:
extracting keywords from the unstructured single-gene inherited disease research data;
establishing a second correspondence between the keywords and the unstructured single-gene inherited disease research data to obtain the structured single-gene inherited disease research data.
4. The method according to claim 1, characterized in that establishing the first correspondence between the structured single-gene inherited disease research data and the strength grades of the structured single-gene inherited disease research data comprises:
obtaining the keywords in the structured single-gene inherited disease research data;
matching the keywords against a pre-established strength grade classification standard;
determining, according to the matching result, the strength grade corresponding to the structured single-gene inherited disease research data;
establishing the first correspondence according to the strength grade corresponding to the structured single-gene inherited disease research data.
5. The method according to claim 1, characterized in that determining the strength grade of the structured single-gene inherited disease research data corresponding to the genetic variation to be analyzed comprises:
obtaining genetic variation data to be analyzed;
determining the structured single-gene inherited disease research data corresponding to the genetic variation data to be analyzed;
determining, according to the first correspondence, the strength grade corresponding to the structured single-gene inherited disease research data.
6. The method according to claim 1, characterized in that, before determining the pathogenicity grade of the genetic variation to be analyzed according to the strength grade and the pre-established genetic variation pathogenicity grade decision tree model, the method further comprises:
establishing the genetic variation pathogenicity grade decision tree model.
7. A device for determining the pathogenicity grade of a genetic variation, characterized in that the device comprises:
an acquisition module, configured to obtain structured single-gene inherited disease research data;
an establishing module, configured to establish a first correspondence between the structured single-gene inherited disease research data and the strength grades of the structured single-gene inherited disease research data;
a first determining module, configured to determine the strength grade of the structured single-gene inherited disease research data corresponding to a genetic variation to be analyzed;
a second determining module, configured to determine the pathogenicity grade of the genetic variation to be analyzed according to the strength grade and a pre-established genetic variation pathogenicity grade decision tree model.
8. The device according to claim 7, characterized in that the acquisition module comprises:
a collecting unit, configured to collect unstructured single-gene inherited disease research data;
a converting unit, configured to convert the unstructured single-gene inherited disease research data into the structured single-gene inherited disease research data.
9. The device according to claim 7, characterized in that the establishing module comprises:
a first acquisition unit, configured to obtain the keywords in the structured single-gene inherited disease research data;
a matching unit, configured to match the keywords against a pre-established strength grade classification standard;
a first determining unit, configured to determine, according to the matching result, the strength grade corresponding to the structured single-gene inherited disease research data;
an establishing unit, configured to establish the first correspondence according to the strength grade corresponding to the structured single-gene inherited disease research data.
10. The device according to claim 7, characterized in that the first determining module comprises:
a second acquisition unit, configured to obtain genetic variation data to be analyzed;
a second determining unit, configured to determine the structured single-gene inherited disease research data corresponding to the genetic variation data to be analyzed;
a third determining unit, configured to determine, according to the first correspondence, the strength grade corresponding to the structured single-gene inherited disease research data.
CN201710170243.3A 2017-03-21 2017-03-21 A kind of pathogenic grade of genetic mutation determines method and device Pending CN106951730A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710170243.3A CN106951730A (en) 2017-03-21 2017-03-21 A kind of pathogenic grade of genetic mutation determines method and device

Publications (1)

Publication Number Publication Date
CN106951730A true CN106951730A (en) 2017-07-14

Family

ID=59472194

Country Status (1)

Country Link
CN (1) CN106951730A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009042686A1 (en) * 2007-09-27 2009-04-02 Perlegen Sciences, Inc. Methods for genetic analysis
CN101617227A (en) * 2006-11-30 2009-12-30 纳维哲尼克斯公司 Genetic analysis systems and method
US20160140288A1 (en) * 2014-11-19 2016-05-19 TCI Gene, Inc. Method for forming personal nutrition complex according to incidence of disease and genetic polymorphism by a prediction system
CN106202936A (en) * 2016-07-13 2016-12-07 为朔医学数据科技(北京)有限公司 A kind of disease risks Forecasting Methodology and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Sue Richards et al.: "Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology", Genetics in Medicine *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110800062A (en) * 2017-10-16 2020-02-14 因美纳有限公司 Deep convolutional neural network for variant classification
CN109086571A (en) * 2018-08-03 2018-12-25 国家卫生计生委科学技术研究所 A kind of method and system that monogenic disease hereditary variation is intelligently interpreted and reported
CN109086571B (en) * 2018-08-03 2019-08-23 国家卫生健康委科学技术研究所 A kind of method and system that monogenic disease hereditary variation is intelligently interpreted and reported
CN109920481A (en) * 2019-01-31 2019-06-21 北京诺禾致源科技股份有限公司 The genetic mutation unscrambling data library BRCA1/2 and its construction method
CN114496072A (en) * 2022-01-17 2022-05-13 北京安琪尔基因医学科技有限公司 Deafness pathogenic analysis grade classification method and device, computer readable storage medium and server
CN114429785A (en) * 2022-04-01 2022-05-03 普瑞基准生物医药(苏州)有限公司 Automatic classification method and device for genetic variation and electronic equipment
CN114429785B (en) * 2022-04-01 2022-07-19 普瑞基准生物医药(苏州)有限公司 Automatic classification method and device for genetic variation and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170714
