CN116230247A - Data analysis method, device, electronic equipment and storage medium - Google Patents

Data analysis method, device, electronic equipment and storage medium

Info

Publication number
CN116230247A
Authority
CN
China
Prior art keywords
data
analysis
analysis method
analyzed
analyzing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310518291.2A
Other languages
Chinese (zh)
Inventor
李妍
成晓亮
周岳
张伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Pinsheng Medical Laboratory Co ltd
Shanghai Ammonia Biotechnology Co ltd
Nanjing Pinsheng Medical Technology Co ltd
Original Assignee
Nanjing Pinsheng Medical Laboratory Co ltd
Shanghai Ammonia Biotechnology Co ltd
Nanjing Pinsheng Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Pinsheng Medical Laboratory Co ltd, Shanghai Ammonia Biotechnology Co ltd, Nanjing Pinsheng Medical Technology Co ltd
Priority to CN202310518291.2A
Publication of CN116230247A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses a data analysis method, a data analysis device, electronic equipment and a storage medium. The method comprises the following steps: acquiring data to be analyzed, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information; determining the data type of each data item in the data to be analyzed, and determining a data analysis method based on those data types; and analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart. In this technical scheme, the data analysis method is determined automatically and the data to be analyzed is then analyzed automatically with that method to obtain an analysis result chart, which realizes integrated, automatic analysis from data to chart and improves data analysis efficiency.

Description

Data analysis method, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data analysis method, a data analysis device, an electronic device, and a storage medium.
Background
With the development of big data technology, healthcare-centered data is growing rapidly. As data volumes and data types increase, performing efficient and convenient multi-omics data processing has become a new challenge.
Current data analysis software is limited in function, and a user has to rely on several pieces of software to obtain a complete analysis result.
In the process of implementing the present invention, the inventors found at least the following technical problem in the prior art: data analysis efficiency is low.
Disclosure of Invention
The invention provides a data analysis method, a data analysis device, electronic equipment and a storage medium, so as to improve the efficiency of data analysis.
According to an aspect of the present invention, there is provided a data analysis method including:
acquiring data to be analyzed, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information;
determining the data type of each data in the data to be analyzed, and determining a data analysis method based on the data type of each data in the data to be analyzed;
and analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart.
According to another aspect of the present invention, there is provided a data analysis apparatus comprising:
the analysis data acquisition module is used for acquiring data to be analyzed, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information;
The analysis method determining module is used for determining the data type of each data in the data to be analyzed and determining a data analysis method based on the data type of each data in the data to be analyzed;
and the result chart determining module is used for analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data analysis method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute a data analysis method according to any one of the embodiments of the present invention.
According to the technical scheme, data to be analyzed is acquired, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information, so that data in various combinations can be acquired. The data type of each data item in the data to be analyzed is then determined, and a data analysis method is determined based on those data types, so that the data analysis method is determined automatically. The data to be analyzed is then analyzed based on the automatically determined data analysis method to obtain an analysis result chart, which realizes integrated, automatic analysis from data to chart and improves data analysis efficiency.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a data analysis method according to a first embodiment of the present invention;
FIG. 2 is a flow chart of a data analysis method according to a second embodiment of the present invention;
FIG. 3 is a flow chart of a data analysis method according to a third embodiment of the present invention;
FIG. 4 is a flow chart of a data analysis method according to a fourth embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a data analysis device according to a fifth embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device implementing a data analysis method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The acquisition, storage, use, and processing of data in the technical solution comply with the relevant provisions of national laws and regulations.
Before the specific embodiments are described, the background of the invention is described in more detail. In the prior art, algorithms for multi-omics data analysis have many adjustable parameters and draw on a range of statistical background knowledge. Most multi-omics data analysis methods have no user-friendly interface and are still at a stage suited to developers working in languages such as Python and R. Because the user has to install the required programs and adjust the code to fit the data, these methods place high demands on the user's computing skills.
Using a single multi-omics analysis method already costs the user considerable learning time and manual data processing time; applying several multi-omics analysis methods so that existing data can be analyzed and compared from multiple angles is harder still. Specifically, different omics data require different analysis methods, and experimenters without coding skills have to learn how to use a large number of websites yet can obtain only the most basic charts. Meanwhile, some multi-omics analysis methods are available only as downloadable example code, whose code and parameters must be adjusted to the user's own multi-omics data, so people without coding skills cannot use them.
Therefore, multi-omics data analysis currently lacks a complete, user-friendly, one-click analysis tool that can combine different data types, integrate data automatically, support multiple multi-omics calculation methods, and draw charts at the same time.
The invention provides a data analysis method, a device, electronic equipment and a storage medium for the integrated processing of multi-omics data and clinical information. They reduce the large amount of time spent manually processing proteomics data, metabonomics data, lipidomics data and clinical information, and greatly lower the difficulty and threshold of multi-omics analysis with multiple analysis methods. The user does not need a computing or statistics background and can run the analysis with one click, and the rich set of chart types makes the analysis results easier to understand and compare.
Example 1
FIG. 1 is a flowchart of a data analysis method according to a first embodiment of the present invention. The method is applicable to the automatic analysis of multi-omics data and may be performed by a data analysis device, which may be implemented in hardware and/or software and may be configured in a computer terminal, a server, or another device. As shown in FIG. 1, the method includes:
S110, acquiring data to be analyzed, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information.
In this embodiment, the data to be analyzed refers to the multi-omics data that is to be analyzed, and may include, but is not limited to, at least two of proteomic data, metabonomic data, lipidomic data and clinical information. In other words, if there is no clinical information, two or more types of omics data must be input; if there is clinical information, one or more types of omics data may be input.
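The input rule described above can be illustrated with a minimal Python sketch. The function name `validate_inputs` and the type labels are hypothetical and are not taken from the patent; the sketch only encodes the stated rule that at least two omics types are needed without clinical information, and at least one with it.

```python
# Hypothetical labels for the three omics data types named in the text.
OMICS_TYPES = ("proteomics", "metabonomics", "lipidomics")


def validate_inputs(provided):
    """Return True when the provided data types form an acceptable combination.

    `provided` is a set of labels drawn from OMICS_TYPES plus "clinical".
    """
    omics_count = sum(1 for t in OMICS_TYPES if t in provided)
    if "clinical" in provided:
        # With clinical information, one or more omics types are enough.
        return omics_count >= 1
    # Without clinical information, two or more omics types are required.
    return omics_count >= 2


print(validate_inputs({"proteomics", "clinical"}))  # True
print(validate_inputs({"proteomics"}))              # False
```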
Specifically, each type of omics data in the data to be analyzed can be obtained from a corresponding database or from external input, and the clinical information can be obtained from a preset storage location of an electronic device or of another device connected to it; for example, the electronic device may be a computer in a hospital.
S120, determining the data type of each data in the data to be analyzed, and determining a data analysis method based on the data type of each data in the data to be analyzed.
In this embodiment, the data types are used to characterize the types of each data in the data to be analyzed, and may include proteomic data types, metabolomic data types, lipidomic data types, and clinical information types.
It should be noted that determining the data type of each data item reveals the composition of the data to be analyzed, so that a data analysis method suited to that data can be selected afterwards. For example, if a protein input file is detected, the data to be analyzed contains data of the proteomic data type, where the protein input file contains information such as the protein accession number; if a metabolite input file is detected, the data to be analyzed contains data of the metabonomic data type, where the metabolite input file contains information such as Name and Compound ID.
For example, if a proteomic data input is detected, it is determined that the data to be analyzed contains data of the proteomic data type, and the variable corresponding to proteomic data is assigned TRUE; if no proteomic data input is detected, it is determined that the data to be analyzed does not contain data of the proteomic data type, and the variable is assigned FALSE. The other omics data and the clinical information are detected in the same way, which is not repeated here. After the data type of each data item has been detected, a data analysis method suited to the data to be analyzed can be selected according to those data types, so that the analysis is targeted to the data.
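As a rough illustration of this detection step, the following Python sketch sets one TRUE/FALSE flag per data type and then picks analysis methods from the flags. The file names in `EXPECTED_FILES` and the selection rules in `select_methods` are assumptions made for the example; the patent does not specify either.

```python
import os

# Hypothetical file names; the source does not say how input files are named.
EXPECTED_FILES = {
    "proteomics": "protein.txt",
    "metabonomics": "metabolite.txt",
    "lipidomics": "lipid.txt",
    "clinical": "clinical.txt",
}


def detect_types(input_dir):
    """Map each data type to True/False depending on whether its file exists."""
    return {
        name: os.path.isfile(os.path.join(input_dir, fname))
        for name, fname in EXPECTED_FILES.items()
    }


def select_methods(flags):
    """Pick analysis methods from the detected types (illustrative rules only)."""
    omics = [k for k in ("proteomics", "metabonomics", "lipidomics") if flags.get(k)]
    methods = []
    if len(omics) >= 2:
        # Assumed: cross-omics methods need at least two omics types.
        methods += ["interaction_network", "o2pls", "mfuzz", "wgcna", "mofa"]
    if flags.get("clinical") and omics:
        # Assumed: clinical correlation needs clinical data plus one omics type.
        methods.append("clinical_correlation")
    if omics:
        methods.append("enrichment")
    return methods
```

Keeping detection and selection in separate functions means the selection rules can be changed without touching how the inputs are discovered.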
S130, analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart.
In this embodiment, the analysis result chart refers to a visual chart obtained by converting the data to be analyzed, so that the data is displayed more intuitively. For example, the analysis result chart may be a network diagram, a line graph, a histogram, a dot plot, etc., and may be displayed as a static chart and/or a dynamic interactive chart, where the dynamic interactive chart may be an offline HTML interactive network diagram.
Specifically, the data to be analyzed can be analyzed by one or more of a correlation analysis method, a cluster analysis method, an enrichment analysis method and a MOFA analysis method, and one or more analysis result charts can be obtained and displayed.
It should be noted that the data analysis method provided in this embodiment reduces the time spent manually processing proteomic, metabonomic, lipidomic and clinical data. The method can be run with one click, so the user does not need coding skills, and the difficulty and threshold of multi-omics analysis are lowered. The analysis result charts can be displayed so that the user can understand and compare the analysis data more intuitively.
According to the technical scheme, data to be analyzed is acquired, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information, so that data in various combinations can be acquired. The data type of each data item in the data to be analyzed is then determined, and a data analysis method is determined based on those data types, so that the data analysis method is determined automatically. The data to be analyzed is then analyzed based on the automatically determined data analysis method to obtain an analysis result chart, which realizes integrated, automatic analysis from data to chart and improves data analysis efficiency.
Example 2
FIG. 2 is a flowchart of a data analysis method according to a second embodiment of the present invention. The method of this embodiment may be combined with each of the alternatives in the data analysis method provided in the foregoing embodiment, and further optimizes that method. Optionally, acquiring the data to be analyzed includes: acquiring omics data and/or clinical information; processing the omics data by a preconfigured omics data processing method to obtain omics processing data, wherein the omics data processing method comprises one or more of a two-tailed Student's t-test, multiple-testing correction and one-way analysis of variance; performing clinical indicator significance analysis on the clinical information to obtain clinical analysis information; and merging the omics processing data with the clinical analysis information to obtain the data to be analyzed.
As shown in fig. 2, the method includes:
S210, acquiring omics data and/or clinical information.
In this embodiment, the omics data may include, but is not limited to, proteomics quantitative files, metabonomics quantitative files, and lipidomics quantitative files.
For example, the proteomics quantitative file may be a peptide quantification result text file output by database search software such as Proteome Discoverer, MaxQuant, Spectronaut or DIA-NN. The metabonomics quantitative file may be a file exported by Compound Discoverer (CD) or MS-DIAL. The lipidomics quantitative file may be a file output by database search software.
In this embodiment, the clinical information may be data in tabular form. Illustratively, the first column of the clinical information may be the sample identifier, the second column the sample group, and the remaining columns the clinical indicators.
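The table layout described above (sample identifier, sample group, then clinical indicator columns) could be read as in the following Python sketch. The tab delimiter, the function name `parse_clinical`, and the indicator names `age` and `bmi` are assumptions chosen for the example.

```python
import csv
import io


def parse_clinical(text, delimiter="\t"):
    """Split a clinical table into sample groups and per-sample indicator values.

    Assumed layout: column 1 = sample ID, column 2 = group, other columns =
    numeric clinical indicators named by the header row.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    indicator_names = header[2:]
    groups = {}
    values = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        sample_id, group = row[0], row[1]
        groups[sample_id] = group
        values[sample_id] = {
            name: float(v) for name, v in zip(indicator_names, row[2:])
        }
    return groups, values


# Hypothetical two-sample table with two clinical indicators.
example = "sample\tgroup\tage\tbmi\nS1\tcase\t54\t23.1\nS2\tcontrol\t49\t21.8\n"
groups, values = parse_clinical(example)
print(groups)  # {'S1': 'case', 'S2': 'control'}
```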
S220, processing the omics data by a preconfigured omics data processing method to obtain omics processing data, wherein the omics data processing method comprises one or more of a two-tailed Student's t-test, multiple-testing correction and one-way analysis of variance.
For example, a two-tailed Student's t-test may be performed on a single omics data set to obtain a P value, and the P value may then be corrected by the Benjamini-Hochberg (BH) multiple-testing correction method to obtain a Q value. The P value is the parameter used to judge the result of a hypothesis test, and the Q value is the P value after multiple-testing correction. If the omics data contains more than two groups, significance is determined by one-way analysis of variance and corrected by the BH method, which yields the omics processing data.
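The BH correction step can be sketched in plain Python as follows. This is a generic implementation of the standard Benjamini-Hochberg adjustment, not code from the patent; the function name `bh_adjust` is hypothetical, and the P values are assumed to have been computed beforehand.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values (Q values), returned in input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down; the running minimum keeps the
    # adjusted values monotone in the rank and caps them at 1.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


q_values = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

Each raw P value is scaled by the number of tests divided by its rank, so the smallest P values receive the strongest correction.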
S230, performing clinical index significance analysis on the clinical information to obtain clinical analysis information.
For example, for each clinical indicator in the clinical information, a two-tailed Student's t-test may be performed between every two comparison groups to check whether the difference between the two group means is significant, which yields the clinical analysis information.
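The pairwise comparison can be sketched as below for one clinical indicator. The sketch computes only the pooled-variance Student t statistic and its degrees of freedom; turning these into a two-tailed P value requires the t-distribution, which is left to a statistics library. The function names and the equal-variance form of the test are assumptions.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance Student t statistic and degrees of freedom for two samples."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def pairwise_tests(values_by_group):
    """Run student_t for every pair of groups of one clinical indicator."""
    return {
        (g1, g2): student_t(values_by_group[g1], values_by_group[g2])
        for g1, g2 in combinations(sorted(values_by_group), 2)
    }


# Hypothetical indicator values for three sample groups.
results = pairwise_tests({"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 7, 9]})
```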
In some embodiments, analysis of variance (ANOVA) may also be performed on the clinical information to verify the significance of the clinical indicators, and Excel tables may be generated for comparison and review.
S240, merging the omics processing data with the clinical analysis information to obtain the data to be analyzed.
After the omics data and the clinical information have been processed, the data to be analyzed is obtained by merging the omics processing data with the clinical analysis information.
S250, determining the data type of each data in the data to be analyzed, and determining a data analysis method based on the data type of each data in the data to be analyzed.
S260, analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart.
According to the technical scheme, omics data and/or clinical information are acquired, and the omics data is processed by a preconfigured omics data processing method to obtain omics processing data, wherein the omics data processing method comprises one or more of a two-tailed Student's t-test, multiple-testing correction and one-way analysis of variance. Clinical indicator significance analysis is then performed on the clinical information to obtain clinical analysis information, and the omics processing data is merged with the clinical analysis information to obtain the data to be analyzed for subsequent analysis.
Example 3
Fig. 3 is a flowchart of a data analysis method according to a third embodiment of the present invention, where the method according to the present embodiment may be combined with each of the alternatives in the data analysis method provided in the foregoing embodiment. The data analysis method provided by the embodiment is further optimized. Optionally, the data analysis method includes one or more of a correlation analysis method, a cluster analysis method, an enrichment analysis method, and a MOFA analysis method.
As shown in fig. 3, the method includes:
S310, acquiring data to be analyzed, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information.
S320, determining the data type of each data in the data to be analyzed, and determining a data analysis method based on the data type of each data in the data to be analyzed, wherein the data analysis method comprises one or more of a correlation analysis method, a cluster analysis method, an enrichment analysis method and a MOFA analysis method.
In this embodiment, the correlation analysis method may be used to mine the relationships between molecules of different omics. The cluster analysis method can be used to determine co-expressed molecules and the regulatory relationships between molecules in the data. The enrichment analysis method can be used to perform distribution tests on the data to be analyzed. The MOFA analysis method can be used to integrate the data to be analyzed.
S330, analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart.
It should be noted that a record file may be generated while the data to be analyzed is being analyzed, so that the analysis process is traceable. In some embodiments, intermediate data-processing files and run parameters can be retained, which makes it easier to re-analyze the data and modify the analysis result charts later. In some embodiments, each data analysis method can produce Excel tables and visual results as various insertable images that the user can quickly retrieve.
Based on the above embodiments, optionally, analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart includes: analyzing the data to be analyzed based on a correlation analysis method to obtain an analysis result chart, wherein the correlation analysis method comprises one or more of an interaction network analysis method, two-way orthogonal partial least squares (O2PLS) and a correlation determination method.
In this example, an interaction network is a network that characterizes the interactions among metabolites, lipids and proteins. Specifically, the molecules identified in each omics are looked up in a relational database, the biological relationships are extracted, and a dynamic network diagram in HTML format is generated. If a reaction relationship exists between a protein, a metabolite or a lipid, the two corresponding points in the network diagram are connected, which shows the interaction. In the network diagram, metabolites may be represented by blue circles and proteins by yellow circles, and a larger circle indicates that more molecules are associated with it.
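A minimal Python sketch of the network construction is given below. The relation record format, the function names, the size formula and the fallback color are all hypothetical; only the blue and yellow color assignments and the idea that more connections give a larger circle come from the text.

```python
from collections import defaultdict

# Colors stated in the text; any other molecule type falls back to an assumed gray.
NODE_COLORS = {"metabolite": "blue", "protein": "yellow"}


def build_network(relations, identified):
    """Keep relations whose two molecules were both identified.

    `relations` holds hypothetical records (molecule_a, type_a, molecule_b, type_b)
    taken from a relational database. Returns sorted edges, node degrees and
    node types.
    """
    edges = set()
    degree = defaultdict(int)
    node_type = {}
    for a, type_a, b, type_b in relations:
        if a in identified and b in identified and a != b:
            edge = tuple(sorted((a, b)))
            if edge not in edges:  # count each connection once
                edges.add(edge)
                degree[a] += 1
                degree[b] += 1
                node_type[a], node_type[b] = type_a, type_b
    return sorted(edges), dict(degree), node_type


def node_style(name, degree, node_type, base=10, step=4):
    """Circle size grows with the number of connected molecules (assumed formula)."""
    color = NODE_COLORS.get(node_type[name], "gray")
    return {"size": base + step * degree[name], "color": color}
```

The resulting edges and node styles could then be handed to any plotting library that writes an HTML network diagram.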
Two-way orthogonal partial least squares (O2PLS) can mine the internal links between two omics data sets from the overall perspective of both, determine their degree of association, and identify the main proteins, metabolites or lipids that drive the association. After O2PLS analysis, each omics data set is divided into a joint part (variation common to both omics), an orthogonal part (variation in one omics that is unrelated to the other) and a noise part (redundant information). The coefficients of the joint part obtained by O2PLS analysis can be used to draw a loading plot that helps interpret the result. In addition, according to the type of omics data input, a chart of the proportion of each part in the two omics is generated automatically, and the loading plots of the two omics can be combined in one figure and distinguished by color and shape, which makes the display more intuitive.
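The three-part split described above corresponds to the decomposition commonly written for O2PLS. The symbols below are the conventional ones from the O2PLS literature and do not appear in the patent: X and Y are the two omics data matrices, T and U the joint scores, W and C the joint loadings, the terms with subscript o the orthogonal parts, and E and F the noise parts.

```latex
\begin{aligned}
X &= \underbrace{T W^{\top}}_{\text{joint}} + \underbrace{T_{o} P_{o}^{\top}}_{\text{orthogonal}} + \underbrace{E}_{\text{noise}}, \\
Y &= \underbrace{U C^{\top}}_{\text{joint}} + \underbrace{U_{o} Q_{o}^{\top}}_{\text{orthogonal}} + \underbrace{F}_{\text{noise}}, \\
U &= T B + H .
\end{aligned}
```

Under this notation, the loading plot would be drawn from the columns of W and C. One plausible reading of the proportion of each part is the ratio of the squared norm of the corresponding term to the squared norm of X or Y, although the patent does not state the formula.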
The correlation determination method performs a univariate linear fit between clinical information and omics data and determines the proportion of the regression relationship that is explained by the single independent variable. Analyzing the data to be analyzed with this method yields a chord diagram in which a wider chord indicates a stronger correlation; blue can represent negative correlation and red positive correlation. In addition to the chord diagram, all clinical information can be integrated according to the distribution of the R values to draw an R-value density plot.
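The univariate fit can be sketched in plain Python as follows. The sketch assumes that the R value is the Pearson correlation coefficient of the fit and that the explained proportion is its square; the function names are hypothetical, while the color mapping follows the text.

```python
def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept, plus r and r squared."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    # r squared is the share of the variance in y explained by x.
    return slope, intercept, r, r * r


def chord_color(r):
    """Red for positive correlation, blue for negative, as described in the text."""
    return "red" if r >= 0 else "blue"


# Hypothetical clinical indicator (x) against one molecule's abundance (y).
slope, intercept, r, r2 = linear_fit([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
```

The absolute value of r could drive the chord width, and the collected r values across all indicators could feed the density plot.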
Based on the above embodiments, optionally, analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart includes: analyzing the data to be analyzed based on a cluster analysis method to obtain an analysis result chart, wherein the cluster analysis method comprises one or more of an Mfuzz multi-omics analysis method and a WGCNA analysis method.
In this embodiment, the Mfuzz multi-omics analysis method can identify trends of change that are consistent across omics, provide data support for the subsequent enrichment analysis and multi-omics correlation analysis, and help find the biological functions carried out by a single cluster.
Specifically, the quantitative data of the differential molecules from all comparison groups in the data to be analyzed are combined, and the median of each molecule's quantitative values within each sample group is calculated. Molecules whose coefficient of variation is greater than 0.1 are retained as the input data for expression-pattern clustering. The sample groups of the input data are first ordered by time series or by degree of treatment. The fuzzy C-means (FCM) clustering algorithm then optimizes an objective function to obtain the membership degree of each omics molecule with respect to every cluster center, which determines the cluster each molecule belongs to and so classifies the omics data automatically. The expression trends of the omics molecules are divided into a given number of clusters, molecules in the same cluster share the same expression trend, and the membership value represents the probability that a molecule belongs to a cluster. Proteins, metabolites and lipid molecules with the same trend are assigned to the same cluster, the classification result is saved, and the fuzzy clustering result is displayed as a combination of a heat map and a line graph, with gradient colors representing the membership values.
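The preprocessing and the membership calculation described above can be sketched in plain Python. The data layout, the function names, the use of the sample standard deviation for the coefficient of variation, and the fuzzifier value m = 2 are assumptions; the membership formula is the standard fuzzy C-means one for fixed cluster centers, not the full iterative algorithm.

```python
from statistics import mean, median, stdev


def group_medians(quant, groups, group_order):
    """Median per sample group for each molecule.

    quant: {molecule: {sample: value}}; groups: {sample: group};
    group_order: groups sorted by time point or treatment degree.
    """
    return {
        mol: [median(v for s, v in by_sample.items() if groups[s] == g)
              for g in group_order]
        for mol, by_sample in quant.items()
    }


def filter_by_cv(medians, threshold=0.1):
    """Keep molecules whose coefficient of variation exceeds the threshold.

    Assumes positive quantitative values and at least two groups.
    """
    return {mol: v for mol, v in medians.items()
            if stdev(v) / mean(v) > threshold}


def memberships(point, centers, m=2.0):
    """Fuzzy C-means membership of one profile to each fixed cluster center."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    zeros = [d == 0 for d in dists]
    if any(zeros):
        # The profile coincides with a center: share full membership among them.
        return [1.0 / sum(zeros) if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]
```

The memberships of one profile always sum to 1, which is what allows them to be read as the probability of belonging to each cluster.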
Furthermore, the data of the different omics can be separated, with information such as grouping recorded, and a corresponding enrichment analysis can be performed on each cluster within a single omics to characterize the biological processes involved under the same expression trend; the results can be displayed in different forms such as bubble charts, heat maps and line graphs.
It should be noted that the Mfuzz method may be used for clustering, dividing the molecules of multiple omics into a user-defined number of clusters with the same trend. After the division is complete, a bubble chart and the corresponding enrichment result table may be obtained from the enrichment analysis results; the enrichment analysis may use a variety of analysis methods and is not limited herein. There may be several heat maps: for example, a first heat map may be derived from the Mfuzz results and a second heat map from the ratios of the different comparison groups to a baseline. The line graph is associated with the Mfuzz display. Unlike enrichment analysis of marked locations, single-omics enrichment is still used here; that is, the Mfuzz method is used only to label the clusters, after which each omics data set and the clinical data can be separated and processed individually. In some embodiments, a single-omics enrichment method may be added to the subsequent enrichment analysis for subsequent data analysis.
To facilitate studying the relationships between the different omics within a single cluster, and thereby to discover upstream and downstream relationships in a biological process, Pearson/Spearman correlations are calculated for the omics data belonging to one cluster, and the results are plotted as an interactive multi-omics interaction network graph.
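The correlation step behind such a network graph might look like the sketch below. The edge threshold of 0.8, the restriction to pairs from different omics layers, and all names and values are assumptions for illustration; the text does not specify them.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_edges(profiles, omics_of, threshold=0.8, method=pearson):
    """Edges between molecules of different omics with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        if omics_of[a] == omics_of[b]:
            continue  # assumption: only cross-omics pairs are drawn
        r = method(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 3)))
    return edges


# Hypothetical molecules that fell into the same cluster.
profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "M1": [2.0, 4.0, 6.0, 8.5],
    "L1": [4.0, 3.0, 2.0, 1.0],
    "M2": [1.0, 3.0, 2.0, 3.0],
}
omics_of = {"P1": "protein", "M1": "metabolite",
            "L1": "lipid", "M2": "metabolite"}
edges = correlation_edges(profiles, omics_of)
```

Each returned tuple is one edge of the network; the sign of the coefficient can be mapped to edge color and its magnitude to edge width when the graph is drawn.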
The WGCNA analysis method analyzes gene expression patterns across multiple samples; it clusters genes with similar expression patterns into modules and analyzes the association between a module and a specific trait or phenotype. In this embodiment, the WGCNA analysis method is used to find whether protein molecules and metabolite molecules are co-expressed; co-expressed molecules can be clustered into the same module, key molecules within the module can be identified, and the analysis results can be visualized.
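As a rough illustration of the co-expression idea, the sketch below computes an unsigned soft-threshold adjacency (the absolute correlation raised to a power) and a per-molecule connectivity, which is one simple way to flag candidate key molecules. It is not the WGCNA package: the power `beta=6`, the toy profiles and the function names are assumptions, and module detection (topological overlap plus hierarchical clustering) is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned adjacency a_ij = |cor(i, j)| ** beta for every pair i != j."""
    names = sorted(profiles)
    return {
        a: {b: abs(pearson(profiles[a], profiles[b])) ** beta
            for b in names if b != a}
        for a in names
    }


def connectivity(adjacency):
    """Sum of a molecule's adjacencies to all other molecules."""
    return {a: sum(neighbours.values()) for a, neighbours in adjacency.items()}


# Hypothetical abundance profiles across five samples.
profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "M1": [2.0, 4.0, 6.0, 8.0, 11.0],
    "M2": [1.1, 2.1, 2.9, 4.2, 5.0],
    "L1": [3.0, 1.0, 4.0, 1.0, 5.0],
}
adjacency = soft_threshold_adjacency(profiles)
k = connectivity(adjacency)
hub = max(k, key=k.get)       # most connected molecule
isolated = min(k, key=k.get)  # least connected molecule
```

Raising the correlation to a power suppresses weak correlations, so the three co-varying molecules keep strong links to each other while the noisy one is left almost unconnected.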
Based on the above embodiments, optionally, analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart includes: analyzing the data to be analyzed based on an enrichment analysis method to obtain an analysis result chart, wherein the enrichment analysis method comprises one or more of a single-omics enrichment method, a multi-omics co-enrichment method and a gene set enrichment analysis.
In this embodiment, the single-omics enrichment method refers to performing enrichment analysis on each omics individually.
The multi-omics co-enrichment method uses the detected protein, metabolite and lipid molecules as the background and the differential molecules as the foreground. In particular, the hypergeometric distribution

$$P = 1 - \sum_{i=0}^{m-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

may be used to test the significance of a given functional class within a set of differentially expressed molecules; through significance analysis, enrichment analysis and false-positive analysis based on this discrete distribution, protein and metabolite functional classes that are significantly related to the experimental purpose and have a low false-positive rate are obtained. Here N is the number of molecules with pathway annotations among all molecules; n is the number of differentially expressed molecules in N; M is the number of molecules among all molecules that are annotated to a particular pathway; and m is the number of differentially expressed molecules annotated to that pathway. The significance of an enrichment result is expressed as a p value, which is corrected using the BH method to obtain a p.adjust value; pathways with p.adjust < 0.05 are screened as significantly enriched results and displayed visually using bubble charts and bar graphs.
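A minimal sketch of the test and correction described above, assuming the usual upper-tail hypergeometric p value and the standard Benjamini-Hochberg procedure. The function names and the example counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(N, n, M, m):
    """P(X >= m): chance of seeing at least m pathway members among the
    n differential molecules, given M pathway members among N molecules."""
    upper = min(n, M)
    favourable = sum(comb(M, i) * comb(N - M, n - i)
                     for i in range(m, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(pvalues):
    """BH-adjusted p values, returned in the original order."""
    k = len(pvalues)
    # Walk from the largest p value down, keeping a running minimum.
    order = sorted(range(k), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * k
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = k - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * k / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: (M, m) per pathway, with N = 100 and n = 10.
pathways = {"pathway_A": (20, 6), "pathway_B": (30, 3), "pathway_C": (5, 3)}
names = list(pathways)
pvals = [hypergeom_pvalue(100, 10, M, m) for M, m in pathways.values()]
padj = benjamini_hochberg(pvals)
significant = [name for name, q in zip(names, padj) if q < 0.05]
```

The `significant` list corresponds to the pathways that would be passed on to the bubble chart and bar graph.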
In this embodiment, the gene set enrichment analysis (Gene Set Enrichment Analysis, GSEA) comprises the following steps: the molecules are ranked by their degree of differential expression between the two sample groups, and it is then tested whether the members of a molecule set from the GSEA database tend to cluster at the top or bottom of the ranked list, thereby determining whether the molecule set differs in a statistically significant way between the two comparison groups.
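The ranking-and-accumulation idea can be illustrated with a weighted running-sum enrichment score. This is only a sketch: the weighting exponent `p=1.0`, the function name and the toy ranking are assumptions, and the permutation step that GSEA uses to judge significance is left out.

```python
def enrichment_score(ranked, molecule_set, p=1.0):
    """Running-sum enrichment score for one molecule set.

    ranked: list of (molecule, score) pairs sorted by score, descending.
    Returns the running-sum value with the largest absolute deviation
    from zero; positive means enrichment at the top of the list.
    """
    hit_weights = [abs(score) ** p if name in molecule_set else 0.0
                   for name, score in ranked]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for name, _ in ranked if name not in molecule_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("set must overlap the ranked list without covering it")
    running, best = 0.0, 0.0
    for (name, _), weight in zip(ranked, hit_weights):
        if name in molecule_set:
            running += weight / total_hit   # step up on a set member
        else:
            running -= 1.0 / n_miss         # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking by differential expression between two groups.
ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0),
          ("D", -1.0), ("E", -2.0), ("F", -3.0)]
top_set_score = enrichment_score(ranked, {"A", "B"})
bottom_set_score = enrichment_score(ranked, {"E", "F"})
```

A set concentrated at the top of the list yields a score near +1 and one concentrated at the bottom a score near -1; a set scattered through the list stays close to zero.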
Based on the above embodiments, optionally, analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart includes: analyzing the data to be analyzed based on the MOFA analysis method to obtain an analysis result chart.
In this embodiment, MOFA is a multi-omics data integration method based on a Bayesian group factor analysis framework that represents the overall variability of the molecular features of each omics with lower-dimensional latent factors. The method applies to a wide range of data types, such as discrete, continuous and binary data, can be used with large or small sample sizes, and can estimate missing values. It offers a factor-based perspective on multi-omics data mining and supports a broad set of downstream analyses and plotting modes, including variance decomposition, feature weight inspection, enrichment analysis, sample clustering analysis, latent factor visualization, correlation heat maps of the molecules contributing most to a factor, correlation network graphs, expression heat maps and the like.
In some embodiments, the databases may include, but are not limited to, a multi-omics co-enrichment database, a KEGG-based biological reaction relationship database, and a multi-omics database suitable for GSEA analysis.
The multi-omics co-enrichment database is a multi-omics database built on an enrichment method using Fisher's exact test; it enables co-enrichment of several kinds of omics data, and the enrichment result can be compared with single-omics enrichment results, making it easier to uncover the molecular mechanisms of biological functions. The KEGG-based biological reaction relationship database can be used to extract reaction relationships among proteins, metabolites and lipids, build a multi-omics reaction relationship library, draw multi-omics reaction network graphs, and determine the regulatory relationships between omics. The multi-omics database suitable for GSEA analysis is a molecule set database stored in gmt and gmx formats for use by the GSEA analysis method.
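For the gmt format mentioned above, each line holds one set as tab-separated fields: the set name, a description, and then the members. A small reader and writer could look like the following sketch; the pathway names and members are invented, and the helper names are hypothetical.

```python
def parse_gmt(text):
    """Parse GMT text into {set_name: {"description": ..., "members": [...]}}."""
    sets = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, description, *members = line.split("\t")
        sets[name] = {
            "description": description,
            "members": [m for m in members if m],
        }
    return sets


def to_gmt(sets):
    """Serialize the structure returned by parse_gmt back to GMT text."""
    lines = [
        "\t".join([name, entry["description"], *entry["members"]])
        for name, entry in sets.items()
    ]
    return "\n".join(lines) + "\n"


# Hypothetical multi-omics sets mixing protein, metabolite and lipid IDs.
example = (
    "PathwayA\tdemo description\tP1\tM1\tL1\n"
    "PathwayB\tna\tP2\tM2\n"
)
sets = parse_gmt(example)
round_trip = to_gmt(sets)
```

Because each set carries its own member list, one line can combine identifiers from several omics layers, which is what allows a single set to be tested against a multi-omics ranking.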
According to the technical scheme provided by the embodiment of the invention, the data to be analyzed is analyzed through one or more of the association analysis method, the cluster analysis method, the enrichment analysis method and the MOFA analysis method, so that an analysis result chart is obtained, automatic generation of the chart is realized, and the data analysis efficiency is improved.
Example four
Fig. 4 is a flowchart of a data analysis method according to a fourth embodiment of the present invention. The method of this embodiment may be combined with any of the alternatives in the data analysis methods provided in the foregoing embodiments, and further optimizes them. Optionally, determining a data analysis method based on the data type of each data in the data to be analyzed includes: if the data types comprise a plurality of omics data types, determining the data analysis method to be a multi-omics data analysis method; if the data types comprise clinical information and a single omics data type, determining the data analysis method to be a data fitting analysis method; if the data types comprise clinical information and a plurality of omics data types, determining the data analysis method to be a multi-omics data analysis method and/or a data fitting analysis method. Correspondingly, analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart includes: analyzing the data to be analyzed based on the multi-omics data analysis method and/or the data fitting analysis method to obtain an analysis result chart.
As shown in fig. 4, the method includes:
S410, acquiring data to be analyzed, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information.
S420, determining the data type of each data in the data to be analyzed; if the data types comprise a plurality of omics data types, determining the data analysis method to be a multi-omics data analysis method; if the data types comprise clinical information and a single omics data type, determining the data analysis method to be a data fitting analysis method; if the data types comprise clinical information and a plurality of omics data types, determining the data analysis method to be a multi-omics data analysis method and/or a data fitting analysis method.
In this embodiment, the multi-omics data analysis method refers to a method for analyzing data from multiple omics. By way of example, multi-omics data analysis methods may include, but are not limited to, the Mfuzz multi-omics analysis method, bidirectional orthogonal partial least squares, and the like. The data fitting analysis method refers to a method for fitting analysis of clinical information and omics data. By way of example, data fitting analysis methods may include, but are not limited to, correlation determination methods, the WGCNA analysis method, and the like.
S430, analyzing the data to be analyzed based on the multi-omics data analysis method and/or the data fitting analysis method to obtain an analysis result chart.
For example, if the data types comprise clinical information and a plurality of omics data types, the data analysis methods are determined to be the Mfuzz multi-omics analysis method, bidirectional orthogonal partial least squares, the correlation determination method and the WGCNA analysis method; the data to be analyzed are then analyzed with each of these four methods to obtain an analysis result chart corresponding to each method.
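The selection rule of step S420 can be expressed as a small lookup function. The sketch below is one possible reading: the type labels, the method labels and the choice to return both method families when clinical information accompanies several omics are assumptions based on the example above.

```python
OMICS_TYPES = {"proteomics", "metabolomics", "lipidomics"}
CLINICAL = "clinical"

# Method labels taken from the examples in the text; the strings are arbitrary.
MULTI_OMICS_METHODS = ["Mfuzz", "bidirectional orthogonal partial least squares"]
DATA_FITTING_METHODS = ["correlation determination", "WGCNA"]


def determine_methods(data_types):
    """Map the data types present in the input to analysis methods."""
    types = set(data_types)
    unknown = types - OMICS_TYPES - {CLINICAL}
    if unknown:
        raise ValueError(f"unsupported data types: {sorted(unknown)}")
    if len(types) < 2:
        raise ValueError("at least two kinds of data are required")
    n_omics = len(types & OMICS_TYPES)
    methods = []
    if n_omics >= 2:
        # Several omics layers: multi-omics data analysis methods apply.
        methods += MULTI_OMICS_METHODS
    if CLINICAL in types:
        # Clinical information plus omics data: data fitting methods apply.
        methods += DATA_FITTING_METHODS
    return methods


omics_only = determine_methods(["proteomics", "lipidomics"])
clinical_single = determine_methods(["clinical", "metabolomics"])
clinical_multi = determine_methods(["clinical", "proteomics", "metabolomics"])
```

Each returned label would then be dispatched to its analysis routine in step S430, producing one result chart per method.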
According to the technical scheme provided by the embodiment of the invention, the data analysis method is determined according to the data type of each data in the data to be analyzed, so that the automatic determination of the data analysis method is realized, and a guarantee is provided for realizing automatic analysis of the data.
Example five
Fig. 5 is a schematic structural diagram of a data analysis device according to a fifth embodiment of the present invention. As shown in fig. 5, the apparatus includes:
an analysis data acquisition module 510, configured to acquire data to be analyzed, where the data to be analyzed includes at least two of proteomic data, metabonomic data, lipidomic data, and clinical information;
An analysis method determining module 520, configured to determine a data type of each data in the data to be analyzed, and determine a data analysis method based on the data type of each data in the data to be analyzed;
the result chart determining module 530 is configured to analyze the data to be analyzed based on the data analysis method, so as to obtain an analysis result chart.
According to this technical scheme, data to be analyzed comprising at least two of proteomic data, metabonomic data, lipidomic data and clinical information is acquired, so that various combinations of data can be obtained. The data type of each data in the data to be analyzed is then determined, and a data analysis method is determined based on those data types, so that the data analysis method is selected automatically. Finally, the data to be analyzed is analyzed with the automatically determined data analysis method to obtain an analysis result chart, which achieves integrated, automatic analysis from data to chart and improves data analysis efficiency.
In some alternative embodiments, the analysis data acquisition module 510 is further configured to:
acquiring omics data and/or clinical information;
performing data processing on the omics data through a preconfigured omics data processing method to obtain processed omics data, wherein the omics data processing method comprises one or more of a two-tailed Student's t-test, multiple testing correction and one-way analysis of variance;
carrying out clinical index significance analysis on the clinical information to obtain clinical analysis information;
and merging the processed omics data with the clinical analysis information to obtain the data to be analyzed.
In some alternative embodiments, the data analysis method includes one or more of an association analysis method, a cluster analysis method, an enrichment analysis method, and a MOFA analysis method.
In some alternative embodiments, the result chart determination module 530 is further configured to:
and analyzing the data to be analyzed based on the association analysis method to obtain an analysis result chart, wherein the association analysis method comprises one or more of an interaction network analysis method, a bi-directional orthogonal partial least square method and a correlation determination method.
In some alternative embodiments, the result chart determination module 530 is further configured to:
analyzing the data to be analyzed based on the cluster analysis method to obtain an analysis result chart, wherein the cluster analysis method comprises one or more of an Mfuzz multi-omics analysis method and a WGCNA analysis method.
In some alternative embodiments, the result chart determination module 530 is further configured to:
analyzing the data to be analyzed based on the enrichment analysis method to obtain an analysis result chart, wherein the enrichment analysis method comprises one or more of a single-omics enrichment method, a multi-omics co-enrichment method and a gene set enrichment analysis.
In some alternative embodiments, the result chart determination module 530 is further configured to:
and analyzing the data to be analyzed based on the MOFA analysis method to obtain an analysis result chart.
In some alternative embodiments, the analysis method determination module 520 is further configured to:
if the data types comprise a plurality of omics data types, determining the data analysis method to be a multi-omics data analysis method;
if the data types comprise clinical information and a single omics data type, determining the data analysis method to be a data fitting analysis method;
if the data types comprise clinical information and a plurality of omics data types, determining the data analysis method to be a multi-omics data analysis method and/or a data fitting analysis method;
accordingly, the result chart determination module 530 is further configured to:
analyze the data to be analyzed based on the multi-omics data analysis method and/or the data fitting analysis method to obtain an analysis result chart.
The data analysis device provided by the embodiment of the invention can execute the data analysis method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example six
Fig. 6 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic equipment may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, eyeglasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 6, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An I/O interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as a data analysis method, which includes:
acquiring data to be analyzed, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information;
Determining the data type of each data in the data to be analyzed, and determining a data analysis method based on the data type of each data in the data to be analyzed;
and analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart.
In some embodiments, the data analysis method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the data analysis method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the data analysis method in any other suitable way (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (11)

1. A method of data analysis, comprising:
acquiring data to be analyzed, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information;
determining the data type of each data in the data to be analyzed, and determining a data analysis method based on the data type of each data in the data to be analyzed;
and analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart.
2. The method of claim 1, wherein the acquiring the data to be analyzed comprises:
acquiring omics data and/or clinical information;
performing data processing on the omics data through a preconfigured omics data processing method to obtain processed omics data, wherein the omics data processing method comprises one or more of a two-tailed Student's t-test, multiple testing correction and one-way analysis of variance;
carrying out clinical index significance analysis on the clinical information to obtain clinical analysis information;
and merging the processed omics data with the clinical analysis information to obtain the data to be analyzed.
3. The method of claim 1, wherein the data analysis method comprises one or more of an association analysis method, a cluster analysis method, an enrichment analysis method, and a MOFA analysis method.
4. A method according to claim 3, wherein said analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart comprises:
and analyzing the data to be analyzed based on the association analysis method to obtain an analysis result chart, wherein the association analysis method comprises one or more of an interaction network analysis method, a bi-directional orthogonal partial least square method and a correlation determination method.
5. A method according to claim 3, wherein said analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart comprises:
analyzing the data to be analyzed based on the cluster analysis method to obtain an analysis result chart, wherein the cluster analysis method comprises one or more of an Mfuzz multi-omics analysis method and a WGCNA analysis method.
6. A method according to claim 3, wherein said analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart comprises:
analyzing the data to be analyzed based on the enrichment analysis method to obtain an analysis result chart, wherein the enrichment analysis method comprises one or more of a single-omics enrichment method, a multi-omics co-enrichment method and a gene set enrichment analysis.
7. A method according to claim 3, wherein said analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart comprises:
and analyzing the data to be analyzed based on the MOFA analysis method to obtain an analysis result chart.
8. The method according to claim 1, wherein the determining a data analysis method based on the data type of each of the data to be analyzed includes:
if the data types comprise a plurality of omics data types, determining the data analysis method to be a multi-omics data analysis method;
if the data types comprise clinical information and a single omics data type, determining the data analysis method to be a data fitting analysis method;
if the data types comprise clinical information and a plurality of omics data types, determining the data analysis method to be a multi-omics data analysis method and/or a data fitting analysis method;
correspondingly, the analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart comprises:
analyzing the data to be analyzed based on the multi-omics data analysis method and/or the data fitting analysis method to obtain an analysis result chart.
9. A data analysis device, comprising:
the analysis data acquisition module is used for acquiring data to be analyzed, wherein the data to be analyzed comprises at least two of proteomic data, metabonomic data, lipidomic data and clinical information;
the analysis method determining module is used for determining the data type of each data in the data to be analyzed and determining a data analysis method based on the data type of each data in the data to be analyzed;
and the result chart determining module is used for analyzing the data to be analyzed based on the data analysis method to obtain an analysis result chart.
10. An electronic device, the electronic device comprising:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data analysis method of any one of claims 1-8.
11. A computer readable storage medium storing computer instructions for causing a processor to perform the data analysis method of any one of claims 1-8.
CN202310518291.2A 2023-05-10 2023-05-10 Data analysis method, device, electronic equipment and storage medium Pending CN116230247A (en)

Publications (1)

Publication Number Publication Date
CN116230247A true CN116230247A (en) 2023-06-06

Family

ID=86589634

Country Status (1)

Country Link
CN (1) CN116230247A (en)

Similar Documents

Publication Publication Date Title
CN107368700A (en) Microbial diversity interaction analysis system and method based on a cloud computing platform
CN110377704B (en) Data consistency detection method and device and computer equipment
CN107077489A (en) Automatic insights for multidimensional data
CN110909222B (en) User portrait establishing method and device based on clustering, medium and electronic equipment
CN111581092A (en) Method for generating simulation test data, computer device and storage medium
CN113377486A (en) Data visualization display method, device, equipment and storage medium
KR102565798B1 (en) Method and device for extracting spatial relationship of geographic location points
CN111339290A (en) Text classification method and system
CN116230247A (en) Data analysis method, device, electronic equipment and storage medium
CN116450827A (en) Event template induction method and system based on large-scale language model
CN115687352A (en) Storage method and device
CN116089490A (en) Data analysis method, device, terminal and storage medium
CN112328951B (en) Processing method of experimental data of analysis sample
CN114067169A (en) Raman spectrum analysis method based on convolutional neural network
CN109584047B (en) Credit granting method, system, computer equipment and medium
CN113779248A (en) Data classification model training method, data processing method and storage medium
Pham et al. Contimap: Continuous heatmap for large time series data
CN112785000A (en) Machine learning model training method and system for large-scale machine learning system
CN111863136A (en) Integrated system and method for correlation analysis among multiple sets of chemical data
CN114971110A (en) Method for determining root combination, related device, equipment and storage medium
CN112836747A (en) Eye movement data outlier processing method and device, computer equipment and storage medium
CN111984637A (en) Missing value processing method and device in data modeling, equipment and storage medium
CN111046894A (en) Method and device for identifying vest account
CN110807599A (en) Method, device, server and storage medium for deciding electrochemical energy storage scheme
CN114911963B (en) Template picture classification method, device, equipment, storage medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230606