CN110246544A - Biomarker selection method and system based on integration analysis - Google Patents
- Publication number
- CN110246544A (application CN201910409758.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
Abstract
The invention discloses a biomarker selection method and system based on integration analysis. The method comprises the following steps: selecting raw sequencing data; mapping the raw sequencing data with the FANSe algorithm to obtain gene quantification information, and setting the original gene grouping; calculating the importance ranking of genes within each original group with the GWGS algorithm, then integrating the gene importance of every group with the GWRS algorithm to obtain an integrated gene importance list in which genes are sorted by importance from high to low; and performing data mining with an SVM-based Wrapper Feature Selection model to distinguish sample types and screen biomarkers from the high-importance genes. Based on the characteristics of sequencing data, the invention combines multi-center raw sequencing data, addresses systematic differences in platform, sample and experimental design, and performs in-depth data mining with a highly robust integration analysis algorithm to identify common, specific and key biological macromolecules.
Description
Technical field
The present invention relates to the technical field of biomarker detection, and in particular to a biomarker selection method and system based on integration analysis.
Background
Finding biological macromolecules (including nucleic acids and proteins) that are more common, more specific and more critical can improve medical treatment, but existing molecular markers rarely meet all three requirements. Most molecular markers are derived from multi-center data, and the conventional way of handling such data is meta-analysis, which integrates the conclusions of multi-center studies. Because multi-center data often differ in study subjects, instruments and methods, merging the raw data for analysis without distinction is not appropriate. Meta-analysis, in turn, is easily biased by the quality of the original data, the analytical skill of the original researchers and errors or omissions in the original research tools, so a large amount of valuable data is not fully used.
Summary of the invention
To overcome the shortcomings of the prior art, the present invention provides a biomarker selection method and system based on integration analysis. It establishes an integration analysis strategy in which a highly robust integration algorithm, built on a high-precision low-level processing algorithm, performs integration analysis directly on multi-center raw sequencing data, so that massive multi-center sequencing data are fully used to identify common, specific and key biological macromolecules.
To achieve the above object, the invention adopts the following technical solution:
The present invention provides a biomarker selection method based on integration analysis, comprising the following steps:
S1: selecting raw sequencing data;
S2: mapping the raw sequencing data with the FANSe algorithm to obtain gene quantification information, and setting the original gene grouping;
S3: calculating the importance ranking of genes within each original group with the GWGS algorithm, then integrating the gene importance of every group with the GWRS algorithm to obtain an integrated gene importance list, with genes sorted by importance from high to low;
S4: performing data mining with an SVM-based Wrapper Feature Selection model to distinguish sample types, and screening biomarkers from the high-importance genes.
In a preferred technical solution, the raw sequencing data in step S1 are sequencing files in FASTQ format generated by the sequencer.
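As a minimal sketch of what reading such a file involves, the following Python generator parses four-line FASTQ records. The names `FastqRecord` and `read_fastq` are illustrative and do not come from the patent, and the sketch assumes uncompressed records whose sequence and quality strings are not wrapped across lines.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str       # header line without the leading "@"
    sequence: str   # base calls
    quality: str    # per-base quality string, same length as sequence


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# Two toy records held in memory instead of a file on disk.
example = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGACCAA\n+\nIIIIHHHH\n"
)
records = list(read_fastq(example))
```

For a real file, the same generator can be given the handle returned by `open(path)`.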
In a preferred technical solution, the mapping analysis in step S2 comprises the following steps:
Each short read is split into multiple non-overlapping seeds of equal length, and all seeds are matched against the reference genome. The matched seeds are tallied and scored by start site and ranked by score. The reference sequence is extracted at the candidate coordinates, the short read is compared with the extracted reference sequence, and the top-ranked position in the comparison is taken as the final position of the read, from which the gene quantification information is obtained.
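The seed-and-vote idea in these steps can be sketched in a few lines of Python. This is an illustration of the general technique only, not the FANSe implementation: the function names, the exact-match seed index, the mismatch-count comparison and the `max_candidates` cut-off are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple


def build_seed_index(reference: str, seed_len: int) -> Dict[str, List[int]]:
    """Hash every substring of length seed_len to its start positions."""
    index: Dict[str, List[int]] = defaultdict(list)
    for pos in range(len(reference) - seed_len + 1):
        index[reference[pos:pos + seed_len]].append(pos)
    return index


def map_read(read: str, reference: str, index: Dict[str, List[int]],
             seed_len: int, max_candidates: int = 5) -> Optional[Tuple[int, int]]:
    """Return (start, matching_bases) of the best candidate, or None."""
    # 1. Split the read into non-overlapping seeds of equal length and let
    #    each seed hit vote for the read start position it implies.
    votes: Counter = Counter()
    for offset in range(0, len(read) - seed_len + 1, seed_len):
        for hit in index.get(read[offset:offset + seed_len], ()):
            start = hit - offset
            if 0 <= start <= len(reference) - len(read):
                votes[start] += 1
    # 2. Rank candidate starts by votes, extract the reference window at each
    #    candidate and compare it with the read base by base.
    best: Optional[Tuple[int, int]] = None
    for start, _ in votes.most_common(max_candidates):
        window = reference[start:start + len(read)]
        matches = sum(a == b for a, b in zip(read, window))
        if best is None or matches > best[1]:
            best = (start, matches)
    return best


reference = "TTGACCGATTACAGGCTAACGTTAGCCATGCAAGT"
read = "TTACATGCTAAC"  # reference[8:20] with one substituted base
index = build_seed_index(reference, seed_len=4)
result = map_read(read, reference, index, seed_len=4)
print(result)  # (8, 11)
```

In the toy example two of the three seeds vote for position 8 and the mismatching seed votes for a spurious position, which the base-by-base comparison then rejects.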
In a preferred technical solution, calculating the importance ranking of genes within the original groups with the GWGS algorithm in step S3 first evaluates the mapped sequencing data with the GWRS algorithm, assigning different values according to the significance of expression. The GWRS evaluation is calculated by the following formula:
where r_ij denotes the rank value of the i-th gene in the j-th microarray, i ∈ (1, m), j ∈ (1, n), and s_ij is the GWRS value; for a gene whose value in a microarray is NA, s_ij is also set to NA.
In a preferred technical solution, the importance of every group of genes in step S3 is then integrated with the GWRS algorithm, calculated by the following formula:
where ω_j denotes the weight of the j-th microarray and s_ij is the GWRS value.
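The two formulas are not reproduced in this text, so the following Python sketch only illustrates the general shape of a rank-based score followed by a weighted integration. Both the scoring rule (a linearly normalised rank) and the integration rule (a weighted mean that skips NA values) are assumptions standing in for the patented formulas, and all function names are hypothetical.

```python
from typing import Dict, List, Optional

# None plays the role of NA.
Score = Optional[float]


def rank_scores(ranks: List[Optional[int]], n_genes: int) -> List[Score]:
    """Assumed stand-in for s_ij: rank 1 maps to 1.0, rank m maps to 1/m."""
    return [None if r is None else (n_genes - r + 1) / n_genes for r in ranks]


def integrate(scores: Dict[str, List[Score]],
              weights: Dict[str, float]) -> List[Score]:
    """Assumed stand-in for the integration: weighted mean over data sets."""
    n_genes = len(next(iter(scores.values())))
    combined: List[Score] = []
    for i in range(n_genes):
        numerator = 0.0
        denominator = 0.0
        for name, column in scores.items():
            if column[i] is not None:  # NA entries contribute nothing
                numerator += weights[name] * column[i]
                denominator += weights[name]
        combined.append(numerator / denominator if denominator > 0 else None)
    return combined


genes = ["A", "B", "C"]
ranks = {"d1": [1, 2, 3], "d2": [2, None, 1]}   # r_ij per data set j
weights = {"d1": 1.0, "d2": 3.0}                # omega_j per data set j
scores = {name: rank_scores(r, len(genes)) for name, r in ranks.items()}
combined = integrate(scores, weights)
# Sort genes by integrated importance, highest first, dropping all-NA genes.
ranking = [g for g, s in sorted(
    ((g, s) for g, s in zip(genes, combined) if s is not None),
    key=lambda pair: pair[1], reverse=True)]
print(ranking)  # ['C', 'A', 'B']
```

Gene B is missing from the second data set, so its integrated value comes from the first data set alone.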
In a preferred technical solution, the data mining with the SVM-based Wrapper Feature Selection model in step S4 comprises the following steps:
S41: establishing a Wrapper Feature Selection model based on SVM and training it;
S42: feeding the importance-ranked genes into the trained Wrapper Feature Selection model and judging whether the output separates the sample types. If the preset condition is met, the corresponding genes are output; if not, the data mining loop continues, adding genes one at a time until the preset condition is met, and the genes corresponding to the final result are output.
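The loop in S42 can be sketched as follows. This is an illustration under assumptions, not the patented implementation: `incremental_wrapper_selection`, `evaluate` and `target_accuracy` are hypothetical names, the preset condition is assumed to be an accuracy threshold, and the optional `svm_cv_accuracy` helper assumes scikit-learn is installed and refits a linear SVM with cross-validation for each gene subset.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_wrapper_selection(
    ranked_genes: Sequence[str],
    evaluate: Callable[[List[str]], float],
    target_accuracy: float,
) -> Tuple[List[str], float]:
    """Add genes in importance order until evaluate() reaches the target."""
    selected: List[str] = []
    accuracy = 0.0
    for gene in ranked_genes:
        selected.append(gene)            # one more gene than the last round
        accuracy = evaluate(selected)
        if accuracy >= target_accuracy:  # assumed form of the preset condition
            break
    return selected, accuracy


def svm_cv_accuracy(selected: List[str], X: Sequence[Sequence[float]],
                    y: Sequence[int], gene_names: Sequence[str],
                    folds: int = 5) -> float:
    """Cross-validated accuracy of a linear SVM on the selected gene columns."""
    # Imported lazily so the selection loop itself has no dependencies.
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    columns = [list(gene_names).index(g) for g in selected]
    subset = [[row[c] for c in columns] for row in X]
    return float(cross_val_score(SVC(kernel="linear"), subset, y, cv=folds).mean())


# Stand-in evaluator for demonstration: accuracy depends only on subset size.
demo_accuracy = {1: 0.60, 2: 0.80, 3: 0.95, 4: 0.97}
genes, accuracy = incremental_wrapper_selection(
    ["g1", "g2", "g3", "g4"], lambda subset: demo_accuracy[len(subset)], 0.9)
print(genes, accuracy)  # ['g1', 'g2', 'g3'] 0.95
```

With real data, `evaluate` could be `lambda subset: svm_cv_accuracy(subset, X, y, gene_names)`, where `X` holds one row of expression values per sample and `y` the known sample types.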
The present invention provides a biomarker selection system based on integration analysis, comprising a raw sequencing data selection module, a quantitative analysis module, a ranking integration module and a data mining module;
The raw sequencing data selection module is used to select raw sequencing data, namely sequencing files in FASTQ format from the sequencer;
The quantitative analysis module maps the raw sequencing data with the FANSe algorithm to obtain gene quantification information;
The ranking integration module is used to generate the gene importance list: it calculates the importance ranking of genes within each original group with the GWGS algorithm, then integrates the gene importance of every group with the GWRS algorithm to obtain the integrated gene importance list, with genes sorted by importance from high to low;
The data mining module is used to screen biomarkers: it performs data mining with the SVM-based Wrapper Feature Selection model to distinguish sample types and screens biomarkers from the high-importance genes.
Compared with the prior art, the invention has the following advantages and beneficial effects:
(1) The invention establishes an integration analysis strategy for sequencing data. Based on the characteristics of sequencing data, it combines multi-center raw sequencing data, addresses systematic differences in platform, sample and experimental design, and performs in-depth data mining with a highly robust integration analysis algorithm to identify common, specific and key biological macromolecules.
Brief description of the drawings
Fig. 1 is a flow diagram of the biomarker selection method based on integration analysis in this embodiment.
Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
Embodiment
This embodiment provides a biomarker selection method based on integration analysis. After the raw sequencing data are mapped and quantified with the FANSe series of algorithms, the GWGS algorithm first calculates the importance of each gene within a single data set, and the GWRS algorithm then integrates multiple data sets to obtain the importance ranking of genes across all data sets. Genes are fed into the screening model one by one in order of importance, and the biomarkers are finally selected.
This embodiment introduces the high-precision sequencing analysis algorithm FANSe. FANSe performs sequence alignment based on hash seed matching and aligns short reads to the reference genome efficiently and with high accuracy. The algorithm is highly accurate and extremely fault-tolerant, is very sensitive to micro-insertions and micro-deletions through the Smith-Waterman algorithm, and its results have reliable experimental verification.
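To illustrate the kind of scoring Smith-Waterman provides, the sketch below computes a local alignment score while keeping only two rows of the dynamic-programming matrix and performing no traceback. It is not FANSe code; the function name and the scoring values (match 2, mismatch -1, linear gap -2) are assumptions chosen for the example.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of a and b, computed without traceback."""
    previous = [0] * (len(b) + 1)  # scores for the previous row of the matrix
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = previous[j] + gap        # gap in b
            left = current[j - 1] + gap   # gap in a
            current[j] = max(0, diagonal, up, left)  # 0 restarts the alignment
            best = max(best, current[j])
        previous = current
    return best


print(smith_waterman_score("ACGT", "ACGT"))         # 8: four matches
print(smith_waterman_score("ACGTACGT", "ACGTCGT"))  # 12: seven matches, one gap
print(smith_waterman_score("AAAA", "TTTT"))         # 0: nothing aligns
```

The second call shows why the approach tolerates a single-base deletion: the alignment pays one gap penalty and still collects all seven matches.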
The sequencing data in this embodiment require substantial preprocessing, such as the computationally heavy mapping step, and the precision of this step directly affects the accuracy of the integration analysis.
As shown in Fig. 1, the biomarker selection method based on integration analysis provided in this embodiment comprises the following steps:
S1: selecting raw sequencing data;
S2: mapping the raw sequencing data to obtain accurate quantitative results, obtaining gene quantification information, and setting the original gene grouping:
The raw sequencing data are FASTQ files generated directly by the sequencer. These files are aligned to the reference sequences of the corresponding species to determine which genes are present in the sequenced sample (the qualitative part) and the expression level of each gene (the quantitative part). Mapping proceeds as follows: each short read is split into several non-overlapping seeds of equal length; all seeds are matched against the reference genome; the matched seeds are tallied and scored by start site, with higher scores ranked first; the reference sequence is extracted at the candidate coordinates; the short read is precisely aligned against the extracted reference sequence and scored base by base, with the traceback mechanism of the Smith-Waterman algorithm removed for acceleration; the alignment results are sorted, and the top-ranked position in the precise alignment is taken as the final position of the read, which determines the gene and completes the mapping. Gene expression is then quantified from the number of mapped reads. The algorithm has been evaluated as highly robust and fault-tolerant, so reprocessing data downloaded from different experimental platforms with it can remove or reduce the bias introduced by different platforms or experiments;
S3: calculating the importance ranking of genes within each original group with the GWGS algorithm, then integrating the gene importance of every group with the GWRS algorithm to obtain an integrated gene importance list, with genes sorted by importance from high to low:
First, the GWRS algorithm shown in formula (1) evaluates the single-center sequencing data processed by FANSe, assigning different values according to the significance of expression,
where r_ij denotes the rank value of the i-th gene in the j-th microarray, i ∈ (1, m), j ∈ (1, n), and s_ij is the GWRS value; for a gene whose value in a microarray is NA, s_ij is also set to NA;
The GWGS algorithm shown in formula (2) then performs integration analysis on the above GWRS results, generating a set of gene expression data spanning the multi-center data:
where ω_j denotes the weight of the j-th microarray;
S4: performing data mining with an SVM-based Wrapper Feature Selection model to distinguish sample types, and screening biomarkers from the high-importance genes;
In this embodiment, the model is built on a support vector machine (SVM). The importance-ranked genes produced by steps S2 and S3 are added to the loop step by step, one more gene than in the previous round each time, and fed into the pre-trained Wrapper Feature Selection model. The output is checked against the best stable accuracy, that is, whether it truly separates the sample types. If the best stable accuracy is reached, the loop exits and the genes that produced this result are output; if not, the detection continues, adding genes one at a time until the optimal result is reached. These steps select, from the gene importance list generated in steps S2 and S3, genes that both rank high in importance and accurately distinguish sample types, and these genes serve as markers.
In this embodiment, the Wrapper Feature Selection model is trained in a supervised manner: the sample labels are known, and the model tests whether the input genes separate samples of different stages. This embodiment uses random sampling of 1000 sample data to find the parameters best suited to this data type, that is, the parameters under which the relevant genes distinguish the samples with the highest accuracy.
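A parameter search of this kind can be sketched as a small grid search scored on a random subsample. Everything named here is hypothetical: `pick_parameters`, the fixed random seed, the candidate grid over an SVM regularisation parameter `C`, and the `score` callback, which stands for whatever accuracy estimate is used in practice.

```python
import random
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple

Params = Dict[str, Any]


def pick_parameters(
    samples: Sequence[Any],
    candidates: Sequence[Params],
    score: Callable[[Params, List[Any]], float],
    sample_size: int = 1000,
    seed: int = 0,
) -> Tuple[Optional[Params], float]:
    """Score each candidate on one random subsample and keep the best."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    pool = list(samples)
    subset = pool if len(pool) <= sample_size else rng.sample(pool, sample_size)
    best_params: Optional[Params] = None
    best_score = float("-inf")
    for params in candidates:
        value = score(params, subset)
        if value > best_score:
            best_params, best_score = params, value
    return best_params, best_score


# Toy scoring function: pretends that C = 1.0 gives the highest accuracy.
seen_sizes: List[int] = []


def toy_score(params: Params, subset: List[Any]) -> float:
    seen_sizes.append(len(subset))
    return -abs(params["C"] - 1.0)


best, value = pick_parameters(
    list(range(5000)), [{"C": 0.1}, {"C": 1.0}, {"C": 10.0}], toy_score)
print(best)  # {'C': 1.0}
```

All candidates are scored on the same subsample so that their scores are directly comparable.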
In this embodiment, to adapt the model to sequencing data, sample-related sequencing data are used for model adjustment and preliminary experiments. Modules such as GWRS, GWGS and the SVM in the Wrapper feature selection are adjusted according to the characteristics of the data, while computational efficiency, parallel computing and distributed computing are also taken into account.
In this embodiment, the model needs to be adjusted appropriately for different clinical samples:
1. For sequencing data, the FANSe series of algorithms must be introduced to guarantee the quantification results; screening can only be carried out on good quantification results;
2. GWRS and GWGS must also take the characteristics of sequencing data into account. For example, the P value of the quantitative difference cannot be the only parameter, and several parameters may be needed. This embodiment uses the fold difference as the basis and the P value as the weight of the fold difference to rank gene importance;
3. The sequencing data are sampled, and the screening parameters of the Wrapper Feature Selection model are set according to the characteristics of the data, to guarantee the highest stable accuracy.
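Item 2 does not state how the P value weights the fold difference, so the sketch below uses one common combination as an assumption: the absolute log2 fold change multiplied by -log10 of the P value. The function names, the `min_p` floor and the toy numbers are all hypothetical.

```python
import math
from typing import Dict, List, Tuple


def importance(fold_change: float, p_value: float, min_p: float = 1e-300) -> float:
    """Assumed score: |log2 fold change| weighted by -log10(P value)."""
    # min_p guards against log10(0) for P values reported as exactly zero.
    return abs(math.log2(fold_change)) * -math.log10(max(p_value, min_p))


def rank_genes(stats: Dict[str, Tuple[float, float]]) -> List[str]:
    """Sort gene names by the assumed importance score, highest first."""
    return sorted(stats, key=lambda gene: importance(*stats[gene]), reverse=True)


# gene -> (fold change between groups, P value of the difference); toy values.
stats = {
    "geneA": (4.0, 0.001),  # 4-fold up, strongly significant
    "geneB": (8.0, 0.5),    # 8-fold up, but not significant
    "geneC": (0.25, 0.01),  # 4-fold down, significant
}
print(rank_genes(stats))  # ['geneA', 'geneC', 'geneB']
```

Under this weighting a large but non-significant fold change (geneB) ranks below smaller, well-supported changes, which matches the intent of not relying on a single statistic.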
In this embodiment, clinical sequencing data are screened from multiple databases. Following the steps of the technical solution, all data are first mapped and quantified with the FANSe series of algorithms. After the gene quantification information is obtained, the GWGS algorithm calculates the importance ranking of genes within each original group, taking the original data grouping as the unit, and the GWRS algorithm then integrates the gene importance of every group into a single integrated gene importance list. Genes are sorted by importance from high to low, and the SVM-based Wrapper Feature Selection model screens biological macromolecules (that is, biomarkers) from the important genes. Through the calculation and screening of this batch of data, common, specific and key biological macromolecules are identified.
This embodiment also provides a biomarker selection system based on integration analysis, comprising a raw sequencing data selection module, a quantitative analysis module, a ranking integration module and a data mining module;
The raw sequencing data selection module is used to select raw sequencing data, namely sequencing files in FASTQ format from the sequencer;
The quantitative analysis module maps the raw sequencing data with the FANSe algorithm to obtain gene quantification information;
The ranking integration module is used to obtain the gene importance list: it calculates the importance ranking of genes within each original group with the GWGS algorithm, then integrates the gene importance of every group with the GWRS algorithm to obtain the integrated gene importance list, with genes sorted by importance from high to low;
The data mining module is used to screen biomarkers: it performs data mining with the SVM-based Wrapper Feature Selection model to distinguish sample types and screens biomarkers from the high-importance genes.
The above embodiment is a preferred embodiment of the present invention, but the embodiments of the invention are not limited by it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principles of the present invention is an equivalent replacement and falls within the scope of protection of the present invention.
Claims (7)
1. A biomarker selection method based on integration analysis, characterized by comprising the following steps:
S1: selecting raw sequencing data;
S2: mapping the raw sequencing data with the FANSe algorithm to obtain gene quantification information, and setting the original gene grouping;
S3: calculating the importance ranking of genes within each original group with the GWGS algorithm, then integrating the gene importance of every group with the GWRS algorithm to obtain an integrated gene importance list, with genes sorted by importance from high to low;
S4: performing data mining with an SVM-based Wrapper Feature Selection model to distinguish sample types, and screening biomarkers from the high-importance genes.
2. The biomarker selection method based on integration analysis according to claim 1, characterized in that the raw sequencing data in step S1 are sequencing files in FASTQ format generated by the sequencer.
3. The biomarker selection method based on integration analysis according to claim 1, characterized in that the mapping analysis in step S2 comprises the following steps:
each short read is split into multiple non-overlapping seeds of equal length, and all seeds are matched against the reference genome; the matched seeds are tallied and scored by start site and ranked by score; the reference sequence is extracted at the candidate coordinates, the short read is compared with the extracted reference sequence, and the top-ranked position in the comparison is taken as the final position of the read, from which the gene quantification information is obtained.
4. The biomarker selection method based on integration analysis according to claim 1, characterized in that calculating the importance ranking of genes within the original groups with the GWGS algorithm in step S3 first evaluates the mapped sequencing data with the GWRS algorithm, assigning different values according to the significance of expression, the GWRS evaluation being calculated by the following formula:
where r_ij denotes the rank value of the i-th gene in the j-th microarray, i ∈ (1, m), j ∈ (1, n), and s_ij is the GWRS value; for a gene whose value in a microarray is NA, s_ij is also set to NA.
5. The biomarker selection method based on integration analysis according to claim 1, characterized in that the importance of every group of genes in step S3 is then integrated with the GWRS algorithm, calculated by the following formula:
where ω_j denotes the weight of the j-th microarray and s_ij is the GWRS value.
6. The biomarker selection method based on integration analysis according to claim 1, characterized in that the data mining with the SVM-based Wrapper Feature Selection model in step S4 comprises the following steps:
S41: establishing a Wrapper Feature Selection model based on SVM and training it;
S42: feeding the importance-ranked genes into the trained Wrapper Feature Selection model and judging whether the output separates the sample types; if the preset condition is met, the corresponding genes are output; if not, the data mining loop continues, adding genes one at a time until the preset condition is met, and the genes corresponding to the final result are output.
7. A biomarker selection system based on integration analysis, characterized by comprising a raw sequencing data selection module, a quantitative analysis module, a ranking integration module and a data mining module;
the raw sequencing data selection module is used to select raw sequencing data, namely sequencing files in FASTQ format from the sequencer;
the quantitative analysis module maps the raw sequencing data with the FANSe algorithm to obtain gene quantification information;
the ranking integration module is used to generate the gene importance list: it calculates the importance ranking of genes within each original group with the GWGS algorithm, then integrates the gene importance of every group with the GWRS algorithm to obtain the integrated gene importance list, with genes sorted by importance from high to low;
the data mining module is used to screen biomarkers: it performs data mining with the SVM-based Wrapper Feature Selection model to distinguish sample types and screens biomarkers from the high-importance genes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910409758.3A CN110246544B (en) | 2019-05-17 | 2019-05-17 | Biomarker selection method and system based on integration analysis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910409758.3A CN110246544B (en) | 2019-05-17 | 2019-05-17 | Biomarker selection method and system based on integration analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110246544A true CN110246544A (en) | 2019-09-17 |
CN110246544B CN110246544B (en) | 2021-03-19 |
Family
ID=67884226
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910409758.3A Active CN110246544B (en) | 2019-05-17 | 2019-05-17 | Biomarker selection method and system based on integration analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110246544B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050079524A1 (en) * | 2000-01-21 | 2005-04-14 | Shaw Sandy C. | Method for identifying biomarkers using Fractal Genomics Modeling |
CN104968802A (en) * | 2012-11-16 | 2015-10-07 | 西门子公司 | Novel miRNAs as diagnostic markers |
CN105874080A (en) * | 2013-09-09 | 2016-08-17 | 阿尔玛克诊断有限公司 | Molecular diagnostic test for oesophageal cancer |
CN105874079A (en) * | 2013-09-09 | 2016-08-17 | 阿尔玛克诊断有限公司 | Molecular diagnostic test for lung cancer |
CN109642256A (en) * | 2016-07-28 | 2019-04-16 | 阿利瑟迪亚格公司 | Rna editing as the biomarker tested for emotional handicap |
CN106845152A (en) * | 2017-02-04 | 2017-06-13 | 北京林业大学 | A kind of genome cytimidine site apparent gene type classifying method |
CN109658980A (en) * | 2018-03-20 | 2019-04-19 | 上海交通大学医学院附属瑞金医院 | A kind of screening and application of excrement gene marker |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112686580A (en) * | 2021-01-31 | 2021-04-20 | 重庆渝高科技产业(集团)股份有限公司 | Workflow definition method and system capable of customizing flow |
CN112686580B (en) * | 2021-01-31 | 2023-05-16 | 重庆渝高科技产业(集团)股份有限公司 | Workflow definition method and system capable of customizing flow |
CN114574582A (en) * | 2022-03-21 | 2022-06-03 | 暨南大学 | Transcriptomic standard and preparation method thereof |
CN116543838A (en) * | 2023-07-05 | 2023-08-04 | 苏州凌点生物技术有限公司 | Data analysis method for biological gene selection expression probability |
CN116543838B (en) * | 2023-07-05 | 2023-09-05 | 苏州凌点生物技术有限公司 | Data analysis method for biological gene selection expression probability |
Also Published As
Publication number | Publication date |
---|---|
CN110246544B (en) | 2021-03-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||