CN108763862B - Method for deducing gene pathway activity - Google Patents

Publication number
CN108763862B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201810422205.7A
Other languages
Chinese (zh)
Other versions
CN108763862A (en)
Inventor
刘文斌
沈良忠
昝乡镇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wenzhou University
Original Assignee
Wenzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wenzhou University
Priority to CN201810422205.7A
Publication of CN108763862A
Application granted
Publication of CN108763862B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a method for deducing gene pathway activity. The method comprises: obtaining a sample, the pathway network corresponding to the sample, and the expression values of all genes, and weighting the expression value of each gene using the t value of the gene's t-test and a Pearson correlation coefficient as the gene's weight; obtaining, from the topology of the pathway network, the interaction types between genes in the same pathway and their corresponding strengths, and computing the interaction expression values between genes from those strengths and the weighted expression values of the genes; and integrating the expression values of the genes with the interaction expression values, analyzing them by principal component analysis, and defining the resulting first principal component as the activity score of the pathway. By implementing the invention, pathway activity is inferred from both the importance of individual genes and the importance of the interactions between genes, which allows the state of a biological pathway in a sample to be evaluated.

Description

Method for deducing gene pathway activity
Technical Field
The invention relates to the technical field of gene detection, in particular to a method for deducing gene pathway activity.
Background
Many recent studies propose searching for more robust biomarkers at the functional level in order to overcome the instability of single-gene signatures. Genes do not take part in biological processes in isolation; gene products usually act together as functional modules or signaling cascades. Dysregulated functional modules at this higher level may therefore be more stable biomarkers than single genes and less affected by noise. Functional-level biomarkers can effectively reduce the impact of tissue heterogeneity and of genetic heterogeneity between samples, while also helping to analyze the relationship between important functional pathways and disease. Integrating the expression profiles of functionally related genes and extracting classification features at the functional level is therefore helpful for obtaining more robust biomarkers. Functional modules are often embedded in canonical pathways and protein networks, and this high-throughput information can be obtained from Gene Ontology, the KEGG database, or other gene sets defined in microarray expression profiling studies, such as the Molecular Signatures Database (MSigDB).
Since pathway information closely reflects the chemical interactions and functional relationships between genes, the expression levels of the genes in a pathway are inseparable from the function the pathway performs: once the expression of significant genes in a pathway is disturbed, part of the pathway's function is disrupted as well. Pathway activity is therefore defined by analyzing the gene expression profiles within the pathway and is then used in classification experiments to obtain accurate biomarkers. For example, to address the problem of genes being shared by different pathways, Su et al. designed a log-likelihood function to search for linear sub-pathways with discriminative power, and the resulting sub-pathways further improved classification performance. Breslin et al. inferred pathway activity from the sum of the expression values of the pathway's member genes. Guo et al. inferred pathway activity from the mean or median of the member genes' expression values. Bild et al. applied principal component analysis to the expression profiles of the pathway's member genes and used the first principal component as the pathway activity; this approach can also identify dysregulated pathway patterns and oncogenic pathway signatures, providing an important basis for targeted treatment of the related cancer subtypes. Lee et al. proposed that the condition-responsive genes (CORGs) in a pathway, rather than all genes in the pathway, play the major role in pathway activity. These results indicate that taking functional modules of genes into account can identify more stable biomarkers and achieve more accurate classification.
However, the above methods for inferring pathway activity use only the significant genes in a pathway and treat the pathway as a simple set of individual genes. They do not consider the interactions between genes and ignore the topology of the pathway network, so much important information about communication between genes is lost.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a method for deducing the activity of a gene pathway, and to infer the activity of the pathway by considering the importance of genes and the importance of interactions between genes, thereby realizing the evaluation of the state of a biological pathway sample.
In order to solve the above technical problems, an embodiment of the present invention provides a method for deriving gene pathway activity, comprising the steps of:
step S1, obtaining a sample and its corresponding pathway network, obtaining the expression value of each gene contained in the pathway network, and weighting the expression value of each gene in the pathway network, using as the gene's weight the t value from a t-test of the gene's expression values between two different phenotypes together with the Pearson correlation coefficient between the gene's expression values and the sample phenotype;
step S2, obtaining, from the topology of the pathway network, the interaction types between genes in the same pathway and their corresponding strengths, and computing the interaction expression value between genes in the pathway network from the strength corresponding to each interaction type and the weighted expression values of the genes;
and step S3, integrating the expression values of all genes in the pathway network with the interaction expression values between genes, analyzing them by principal component analysis, and defining the resulting first principal component as the activity score of the pathway.
Wherein, in step S1, the expression values of the genes contained in the pathway network are normalized by the formula
$$z_{ij} = \frac{g_{ij} - \mathrm{mean}(g_i)}{\mathrm{std}(g_i)}$$
where $g_{ij}$ represents the expression value of gene i in sample j, and mean and std represent the mean and standard deviation, respectively, of the gene's expression values across all samples.
Wherein, in step S1, the weighted expression value of a gene is $z'_{ij} = t_{score}(g_i)^2 \cdot \rho(g_i) \cdot z_{ij}$, where $z'_{ij}$ is the weighted expression value of gene $g_i$ in sample j; $t_{score}(g_i)$ is the statistic obtained by a two-tailed t-test of the expression values of gene $g_i$ between the two phenotypes; and $\rho(g_i)$ is the Pearson correlation coefficient between the gene's expression values across all samples and the sample phenotype.
Wherein, in the step S2, the interaction expression value between the genes is
Figure BDA0001651025570000032
where $e_{hj}$ is the expression value of the interaction between gene $g_i$ and gene $g_k$ in sample j; $\beta_{ik}$ is the β value corresponding to the type of interaction between gene $g_i$ and gene $g_k$; $\rho_{ik}$ is the Pearson correlation coefficient between the expression values of gene $g_i$ and gene $g_k$; $z'_{ij}$ is the weighted expression value of gene $g_i$ in sample j; and $z'_{kj}$ is the weighted expression value of gene $g_k$ in sample j.
Wherein, in the step S3, the calculation formula of the activity score of each gene pathway is:
$$a(P_j) = w_{1j} z'_{1j} + w_{2j} z'_{2j} + \dots + w_{ij} z'_{ij} + \dots + w_{nj} z'_{nj} + w_{(n+1)j} e_{1j} + \dots + w_{(n+h)j} e_{hj} + \dots + w_{(n+l)j} e_{lj}$$
where $a(P_j)$ is the pathway activity score of sample j; $w_{1j}$ is the weight of the first gene in sample j in the first principal component; $w_{ij}$ is the weight of gene i in sample j in the first principal component; $w_{(n+1)j}$ is the weight of the first gene-gene interaction in sample j in the first principal component; n is the total number of genes; and l is the number of interactions between genes.
The embodiment of the invention has the following beneficial effects:
The invention uses principal component analysis to analyze, for each sample, the integrated expression values of the genes in the pathway network and the interaction expression values between those genes, and defines the first principal component obtained for each sample as the activity score of the pathway. Pathway activity is thus inferred from both the importance of the genes and the importance of the interactions between them, which gives the method wide practical applicability.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; other drawings that a person skilled in the art obtains from them without inventive effort also fall within the scope of the present invention.
FIG. 1 is a flow chart of a method for inferring gene pathway activity provided in an embodiment of the present invention;
FIG. 2 is a diagram illustrating an application scenario of the method for deriving gene pathway activity according to the present invention;
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings.
Referring to FIG. 1, a method for deriving the activity of a gene pathway is provided in the examples of the present invention, which comprises the following steps:
step S1, obtaining a sample and its corresponding pathway network, obtaining the expression value of each gene contained in the pathway network, and weighting the expression value of each gene in the pathway network, using as the gene's weight the t value from a t-test of the gene's expression values between two different phenotypes together with the Pearson correlation coefficient between the gene's expression values and the sample phenotype;
step S2, obtaining, from the topology of the pathway network, the interaction types between genes in the same pathway and their corresponding strengths, and computing the interaction expression value between genes in the pathway network from the strength corresponding to each interaction type and the weighted expression values of the genes;
and step S3, integrating the expression values of all genes in the pathway network with the interaction expression values between genes, analyzing them by principal component analysis, and defining the resulting first principal component as the activity score of the pathway.
Specifically, in step S1, the expression values of all genes contained in the pathway network are first standardized so that they lie on a common scale; values with different dimensions would otherwise lead to unreasonable classification results. The formula is:
$$z_{ij} = \frac{g_{ij} - \mathrm{mean}(g_i)}{\mathrm{std}(g_i)} \quad (1)$$
In formula (1), $g_{ij}$ represents the expression value of gene i in sample j, and mean and std represent the mean and standard deviation, respectively, of the gene's expression values across all samples. If the expression value of a gene is missing in sample j, the mean of that gene's expression values in the other samples is used to fill in the missing value.
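The standardization and missing-value filling described for formula (1) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `standardize_expression`, the genes-by-samples array layout, the use of `np.nan` to mark missing values, and the choice of the sample standard deviation (`ddof=1`) are all assumptions.

```python
import numpy as np

def standardize_expression(g):
    """Z-score each gene (row) across samples (columns), as in formula (1).

    g: 2-D array of shape (n_genes, n_samples); np.nan marks a missing value.
    """
    g = np.asarray(g, dtype=float)
    # Fill each missing value with the mean of that gene over the other samples.
    gene_mean = np.nanmean(g, axis=1, keepdims=True)
    filled = np.where(np.isnan(g), gene_mean, g)
    # Subtract the gene's mean and divide by its standard deviation.
    mean = filled.mean(axis=1, keepdims=True)
    std = filled.std(axis=1, ddof=1, keepdims=True)  # assumption: sample std
    return (filled - mean) / std

expression = np.array([[1.0, 2.0, 3.0],
                       [2.0, np.nan, 6.0]])
z = standardize_expression(expression)
```

A gene whose expression is constant across all samples has zero standard deviation and would need separate handling before the division.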
The t-test value directly characterizes the difference in a gene's expression between the two phenotypes: the higher the gene's t-test value, the more pronounced that difference. The t-test value can therefore be used to weight the gene expression values, which amplifies the differences in gene expression between phenotypes.
The weighted expression value of each gene in each sample is:
$$z'_{ij} = t_{score}(g_i)^2 \cdot \rho(g_i) \cdot z_{ij} \quad (2)$$
In formula (2), $z'_{ij}$ is the weighted expression value of gene $g_i$ in sample j; $t_{score}(g_i)$ is the statistic obtained by a two-tailed t-test of the expression values of gene $g_i$ between the two phenotypes; and $\rho(g_i)$ is the Pearson correlation coefficient between the gene's expression values across all samples and the sample phenotype.
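A minimal sketch of the weighting in formula (2) is given below. The function names are hypothetical, the phenotype is assumed to be encoded as 0/1 labels, and the t statistic is computed in its unequal-variance (Welch) form; the text does not state which two-sample variant is intended, so that choice is an assumption.

```python
import numpy as np

def t_score(z, labels):
    """Two-sample t statistic per gene (Welch form, an assumption)."""
    z = np.asarray(z, dtype=float)
    labels = np.asarray(labels)
    pos, neg = z[:, labels == 1], z[:, labels == 0]
    n1, n2 = pos.shape[1], neg.shape[1]
    var1 = pos.var(axis=1, ddof=1)
    var2 = neg.var(axis=1, ddof=1)
    return (pos.mean(axis=1) - neg.mean(axis=1)) / np.sqrt(var1 / n1 + var2 / n2)

def phenotype_correlation(z, labels):
    """Pearson correlation between each gene's expression and the 0/1 phenotype."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(labels, dtype=float)
    zc = z - z.mean(axis=1, keepdims=True)
    yc = y - y.mean()
    return (zc @ yc) / np.sqrt((zc ** 2).sum(axis=1) * (yc ** 2).sum())

def weight_expression(z, labels):
    """Formula (2): z'_ij = t_score(g_i)^2 * rho(g_i) * z_ij."""
    w = t_score(z, labels) ** 2 * phenotype_correlation(z, labels)
    return w[:, np.newaxis] * np.asarray(z, dtype=float)

z = np.array([[1.0, 2.0, 3.0, 4.0]])   # one gene, four samples
labels = np.array([0, 0, 1, 1])        # two phenotypes
z_weighted = weight_expression(z, labels)
```

Because the t statistic is squared, the sign of a gene's weight comes entirely from the correlation term ρ(g_i), while the t statistic contributes only magnitude.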
It should be noted that the t-test can be divided into the one-sample test and the two-sample test. The one-sample t-test examines whether the difference between the mean of a sample and the mean of the population is significant. The statistic for the one-sample t-test is:
Figure BDA0001651025570000051
where the symbol shown in Figure BDA0001651025570000052 is the sample mean, n is the number of samples, and $\sigma_X$ is the standard deviation of the sample.
The two-sample t-test measures whether two samples differ at the level of their respective populations. It can be subdivided into the independent-samples t-test and the paired-samples t-test. The independent-samples t-test is commonly used in cancer classification experiments, where the difference in a gene's expression between two phenotypes is described by the gene's t-test value between those phenotypes. The t-test statistic of a gene between two different phenotypes is:
Figure BDA0001651025570000053
where $n_1$ and $n_2$ are the total numbers of positive and negative samples, respectively; the symbols shown in Figure BDA0001651025570000054 and Figure BDA0001651025570000055 are the variances of the gene expression values in the two samples; and the symbols shown in Figure BDA0001651025570000061 and Figure BDA0001651025570000062 are the means of the gene expression values in the two samples. The null hypothesis assumes that the two samples follow normal distributions with the same mean and variance. Strictly, the method is called Student's t-test only when the variances of the two populations are equal; when that assumption does not hold, it is sometimes referred to as Welch's t-test. The t-test can also be used to test whether the difference between two measurements of the same statistic is zero, in which case it is often called a "paired" or "repeated-measures" t-test.
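For reference, the textbook forms of the statistics discussed above are sketched below. They are the standard definitions, not transcriptions of the formula images in this document, whose notation and exact form may differ; the symbols $\mu_0$ (reference mean), $s$, $s_1$, $s_2$ (sample standard deviations) and $s_p$ (pooled standard deviation) are introduced here for illustration.

```latex
% One-sample t statistic: sample mean \bar{X}, n observations
t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}

% Student's two-sample t statistic (equal variances assumed)
t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}

% Welch's t statistic (variances not assumed equal)
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
```

When $n_1 = n_2$ and $s_1 = s_2$, the Student and Welch statistics coincide.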
It should be noted that the Pearson correlation coefficient is often used to characterize both the correlation between gene expression values and the sample phenotype and the correlation between two genes. When two genes interact, the Pearson correlation coefficient directly describes the strength of that interaction. With $\bar{g}_i$ and $\bar{g}_k$ denoting the mean expression values of the two genes over the m samples, the Pearson correlation coefficient of the interacting genes i and k is calculated as:
$$\rho_{ik} = \frac{\sum_{j=1}^{m} (g_{ij} - \bar{g}_i)(g_{kj} - \bar{g}_k)}{\sqrt{\sum_{j=1}^{m} (g_{ij} - \bar{g}_i)^2}\,\sqrt{\sum_{j=1}^{m} (g_{kj} - \bar{g}_k)^2}}$$
The value of the Pearson correlation coefficient lies between -1 and 1. A coefficient of 1 means the two genes are perfectly positively linearly correlated, which is a strong correlation; a coefficient of 0 means the two genes have no linear correlation; and a coefficient of -1 means the two genes are perfectly negatively linearly correlated, which is also a strong correlation.
The Pearson correlation coefficient is symmetric, i.e. corr(X, Y) = corr(Y, X). One key mathematical property of the Pearson correlation coefficient is that it is invariant under changes in the location and scale of the two variables. That is, X can be transformed to a + bX and Y to c + dY, where a, b, c, and d are constants with b > 0 and d > 0, without changing the correlation coefficient between them.
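The symmetry and invariance properties stated above can be checked numerically. The sketch below uses a hypothetical helper `pearson` and arbitrary example data; none of the values come from the patent.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

x = np.array([2.0, 4.0, 5.0, 9.0])
y = np.array([1.0, 3.0, 2.0, 7.0])
a, b, c, d = 3.0, 2.0, -1.0, 0.5   # arbitrary constants with b > 0 and d > 0

r = pearson(x, y)
r_swapped = pearson(y, x)                        # symmetry
r_transformed = pearson(a + b * x, c + d * y)    # location/scale invariance
r_flipped = pearson(x, -y)                       # a negative scale flips the sign
```

The last line shows why the conditions b > 0 and d > 0 matter: a negative scale factor leaves the magnitude unchanged but reverses the sign of the coefficient.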
In step S2, if gene i and gene k interact in the pathway, the expression value of their interaction can be defined from the expression values of the two genes. Each gene interaction is weighted by its strength and its type. The interaction expression value between gene i and gene k is therefore:
Figure BDA0001651025570000064
In formula (3), $e_{hj}$ is the expression value of the interaction between gene $g_i$ and gene $g_k$ in sample j; $\beta_{ik}$ is the β value corresponding to the type of interaction between gene $g_i$ and gene $g_k$; $\rho_{ik}$ is the Pearson correlation coefficient between the expression values of gene $g_i$ and gene $g_k$; $z'_{ij}$ is the weighted expression value of gene $g_i$ in sample j; and $z'_{kj}$ is the weighted expression value of gene $g_k$ in sample j.
By analogy, the value of the expression of the interaction between the genes in the pathway network can be determined.
In step S3, the activity score of each gene pathway is calculated by the formula:
$$a(P_j) = w_{1j} z'_{1j} + w_{2j} z'_{2j} + \dots + w_{ij} z'_{ij} + \dots + w_{nj} z'_{nj} + w_{(n+1)j} e_{1j} + \dots + w_{(n+h)j} e_{hj} + \dots + w_{(n+l)j} e_{lj} \quad (4)$$
In formula (4), $a(P_j)$ is the pathway activity score of sample j; $w_{1j}$ is the weight of the first gene in sample j in the first principal component; $w_{ij}$ is the weight of gene i in sample j in the first principal component; $w_{(n+1)j}$ is the weight of the first gene-gene interaction in sample j in the first principal component; n is the total number of genes; and l is the number of interactions between genes.
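One way to read formula (4) is as the projection of each sample's stacked feature vector (n weighted gene values followed by l interaction values) onto the first principal component. The sketch below follows that reading under several assumptions: the function name `pathway_activity` is hypothetical, the loadings are shared by all samples as in ordinary principal component analysis, the features are mean-centered before projection, and the interaction values are simply taken as an input array because their formula is not reproduced here.

```python
import numpy as np

def pathway_activity(z_weighted, e_interaction):
    """First-principal-component score per sample, in the spirit of formula (4).

    z_weighted:    (n_genes, n_samples) weighted expression values z'_ij
    e_interaction: (n_interactions, n_samples) interaction expression values e_hj
    Returns (activity, w): one score per sample and the n + l loadings.
    """
    features = np.vstack([z_weighted, e_interaction])        # (n + l, n_samples)
    centered = features - features.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / (features.shape[1] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)                    # ascending eigenvalues
    w = eigvecs[:, -1]                                        # first-PC loadings
    activity = w @ centered                                   # weighted sum per sample
    return activity, w

z_demo = np.array([[1.0, 2.0, 3.0, 4.0],
                   [2.0, 4.0, 6.0, 8.0]])    # two genes, four samples
e_demo = np.array([[1.0, 1.0, 2.0, 2.0]])    # one interaction, four samples
activity, w = pathway_activity(z_demo, e_demo)
```

The sign of a principal component is arbitrary, so the scores are defined only up to a global sign flip; a fixed convention is needed if scores are to be compared across runs.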
It should be noted that principal component analysis (PCA) is an important dimensionality-reduction algorithm in machine learning. Its basic principle is to project the original data onto the directions of the eigenvectors of the covariance matrix.
The PCA algorithm roughly comprises the following steps:
1: standardize all sample data, i.e. apply mean normalization;
2: calculate the covariance matrix C of the sample data:
$$C = \frac{1}{m} \sum_{i=1}^{m} x^{(i)} \left(x^{(i)}\right)^{T} \quad (4\text{-}1)$$
where $x^{(i)}$ is the i-th sample, m is the number of samples, and n is the number of values per sample;
3: perform singular value decomposition on the covariance matrix obtained in the previous step:
$$[U, S, V] = \mathrm{svd}(C) \quad (4\text{-}2)$$
4: form the projection matrix P from the eigenvectors corresponding to the selected eigenvalues;
5: project the original data onto the projection matrix:
$$Z = P^{T} X \quad (4\text{-}3)$$
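The five steps above can be sketched in NumPy as follows. The function name `pca_project`, the features-by-samples layout of `x`, and the parameter `k` (number of components kept) are assumptions made for illustration; step 1 is implemented here as mean-centering only.

```python
import numpy as np

def pca_project(x, k=1):
    """PCA following steps 1-5; x has shape (n_features, m_samples)."""
    x = np.asarray(x, dtype=float)
    m = x.shape[1]
    # Step 1: mean normalization of every feature.
    x = x - x.mean(axis=1, keepdims=True)
    # Step 2: covariance matrix C (n x n), formula (4-1).
    c = (x @ x.T) / m
    # Step 3: singular value decomposition, formula (4-2).
    u, s, vt = np.linalg.svd(c)
    # Step 4: projection matrix P from the k leading singular vectors.
    p = u[:, :k]
    # Step 5: project the data, formula (4-3).
    z = p.T @ x
    return z, s

x_demo = np.array([[1.0, 2.0, 3.0, 4.0],
                   [2.0, 4.0, 6.0, 8.5]])
z_demo_pca, singular_values = pca_project(x_demo)
```

Because C is symmetric and positive semi-definite, its singular vectors coincide with its eigenvectors, and `np.linalg.svd` returns the singular values in descending order, so the first column of U corresponds to the first principal component.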
The PCA technique is used in many research fields, and its name varies from field to field; for example, it is known as spectral decomposition in noise and vibration analysis and as empirical modal analysis in structural dynamics. Classification problems in machine learning usually involve a feature selection step. In a classification experiment with a limited number of samples, using tens of thousands of genes as classification features is clearly undesirable, because it greatly reduces the performance of the classifier. Reducing the dimensionality of the biological data is a feasible remedy. Gene data reduced by PCA retain most of the information in the original data; the first principal component has the largest variance and is therefore often selected as an important classification feature.
FIG. 2 illustrates an application scenario of the method for deducing gene pathway activity according to the embodiment of the present invention. First, the gene expression values are normalized. Second, gene interactions are established from the gene expression data and the pathway; in the pathway network, each node represents a gene and each edge represents an interaction between two genes. Third, a pathway activity score is calculated for each sample using principal component analysis.
The embodiment of the invention has the following beneficial effects:
The invention uses principal component analysis to analyze, for each sample, the integrated expression values of the genes in the pathway network and the interaction expression values between those genes, and defines the first principal component obtained for each sample as the activity score of the pathway. Pathway activity is thus inferred from both the importance of the genes and the importance of the interactions between them, which gives the method wide practical applicability.
While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (2)

1. A method of deriving gene pathway activity comprising the steps of:
step S1, obtaining a sample and its corresponding pathway network, obtaining the expression value of each gene contained in the pathway network, and weighting the expression value of each gene in the pathway network, using as the gene's weight the t value from a t-test of the gene's expression values between two different phenotypes together with the Pearson correlation coefficient between the gene's expression values and the sample phenotype;
step S2, obtaining, from the topology of the pathway network, the interaction types between genes in the same pathway and their corresponding strengths, and computing the interaction expression value between genes in the pathway network from the strength corresponding to each interaction type and the weighted expression values of the genes;
step S3, integrating the expression values of all genes in the pathway network with the interaction expression values between genes, analyzing them by principal component analysis, and defining the resulting first principal component as the activity score of the pathway;
in step S1, the expression values of the genes contained in the pathway network are normalized by the formula
$$z_{ij} = \frac{g_{ij} - \mathrm{mean}(g_i)}{\mathrm{std}(g_i)}$$
where $g_{ij}$ represents the expression value of gene i in sample j, and mean and std represent the mean and standard deviation, respectively, of the gene's expression values across all samples;
in step S1, the weighted expression value of a gene is $z'_{ij} = t_{score}(g_i)^2 \cdot \rho(g_i) \cdot z_{ij}$, where $z'_{ij}$ is the weighted expression value of gene $g_i$ in sample j; $t_{score}(g_i)$ is the statistic obtained by a two-tailed t-test of the expression values of gene $g_i$ between the two phenotypes; and $\rho(g_i)$ is the Pearson correlation coefficient between the gene's expression values across all samples and the sample phenotype;
in the step S2, the interaction expression value between the genes is
Figure FDA0002944029500000012
where $e_{hj}$ is the expression value of the interaction between gene $g_i$ and gene $g_k$ in sample j; $\beta_{ik}$ is the β value corresponding to the type of interaction between gene $g_i$ and gene $g_k$; $\rho_{ik}$ is the Pearson correlation coefficient between the expression values of gene $g_i$ and gene $g_k$; $z'_{ij}$ is the weighted expression value of gene $g_i$ in sample j; and $z'_{kj}$ is the weighted expression value of gene $g_k$ in sample j.
2. The method of deriving gene pathway activity according to claim 1 wherein in step S3, the activity score for each gene pathway is calculated by the formula:
$$a(P_j) = w_{1j} z'_{1j} + w_{2j} z'_{2j} + \dots + w_{ij} z'_{ij} + \dots + w_{nj} z'_{nj} + w_{(n+1)j} e_{1j} + \dots + w_{(n+h)j} e_{hj} + \dots + w_{(n+l)j} e_{lj}$$
where $a(P_j)$ is the pathway activity score of sample j; $w_{1j}$ is the weight of the first gene in sample j in the first principal component; $w_{ij}$ is the weight of gene i in sample j in the first principal component; $w_{(n+1)j}$ is the weight of the first gene-gene interaction in sample j in the first principal component; n is the total number of genes; and l is the number of interactions between genes.
CN201810422205.7A 2018-05-04 2018-05-04 Method for deducing gene pathway activity Expired - Fee Related CN108763862B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810422205.7A CN108763862B (en) 2018-05-04 2018-05-04 Method for deducing gene pathway activity

Publications (2)

Publication Number Publication Date
CN108763862A CN108763862A (en) 2018-11-06
CN108763862B true CN108763862B (en) 2021-06-29

Family

ID=64009157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810422205.7A Expired - Fee Related CN108763862B (en) 2018-05-04 2018-05-04 Method for deducing gene pathway activity

Country Status (1)

Country Link
CN (1) CN108763862B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109817337B (en) * 2019-01-30 2020-09-08 中南大学 Method for evaluating channel activation degree of single disease sample and method for distinguishing similar diseases

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004022711A2 (en) * 2002-09-05 2004-03-18 Bioseek, Inc. Biomap analysis
WO2013044354A1 (en) * 2011-09-26 2013-04-04 Trakadis John Method and system for genetic trait search based on the phenotype and the genome of a human subject
CN103045605A (en) * 2012-12-26 2013-04-17 首都医科大学宣武医院 I type neurofibroma NF1 gene mutation nucleotide sequence related to cerebrovascular stenosis and application thereof
CN103180462A (en) * 2010-10-06 2013-06-26 拜奥默里克斯公司 Method for determining biological pathway activity
CN104094266A (en) * 2011-11-07 2014-10-08 独创***公司 Methods and systems for identification of causal genomic variants
CN105893731A (en) * 2015-01-19 2016-08-24 大道安康(北京)科技发展有限公司 Method for building expression detecting system of genetic health network
CN107133492A (en) * 2017-05-02 2017-09-05 温州大学 A kind of method that gene pathway is recognized based on PAGIS
CN107203704A (en) * 2017-05-02 2017-09-26 温州大学 A kind of method that gene pathway is recognized based on GSA
CN107220526A (en) * 2017-05-02 2017-09-29 温州大学 A kind of method that gene pathway is recognized based on PADOG

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090117562A1 (en) * 2007-04-09 2009-05-07 Valerie Wailin Hu Method and kit for diagnosing Autism using gene expression profiling
WO2013152301A1 (en) * 2012-04-05 2013-10-10 H. Lee Moffitt Cancer Center And Research Institute, Inc. O-glycan pathway ovarian cancer signature

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Pathway Interaction Network Analysis Identifies Dysregulated Pathways in Human Monocytes Infected by Listeria monocytogenes; Wufeng Fan et al.; 《Computational and Mathematical Methods in Medicine》; 20170816; 1-8 *
Subpathway Analysis based on Signaling-Pathway Impact Analysis of Signaling Pathway; Xianbin Li et al.; 《PLoS ONE》; 20150724; 1-19 *
一种基于基因拓扑重要性的通路识别方法; 方宏源 et al.; 《生物信息学》; 20171231; Vol. 15 (No. 4); 214-220 *
注意力缺陷多动障碍和低出生体重共有的生物学通路研究; 向波 et al.; 《中华医学遗传学杂志》; 20171231; Vol. 34 (No. 6); 844-848 *

Also Published As

Publication number Publication date
CN108763862A (en) 2018-11-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210629