CN113469203A - Method, electronic device and computer program product for evaluating operation results

Info

Publication number
CN113469203A
CN113469203A
Authority
CN
China
Prior art keywords
prediction model
expert
individuals
node
observation data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010245570.2A
Other languages
Chinese (zh)
Inventor
李伟 (Li Wei)
刘春辰 (Liu Chunchen)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to CN202010245570.2A
Publication of CN113469203A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F 18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 7/00 Computing arrangements based on specific mathematical models
    • G06N 7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/02 Reservations, e.g. for tickets, services or events
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q 50/10 Services
    • G06Q 50/20 Education
    • G06Q 50/205 Education administration or guidance
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Strategic Management (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Educational Administration (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Epidemiology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Educational Technology (AREA)
  • Pathology (AREA)
  • Mathematical Optimization (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)

Abstract

A method, an electronic device, and a computer program product for evaluating operation results are disclosed. In the method, an initial prediction model is established for a set of observation data, the initial prediction model having a hierarchical structure and comprising gating nodes for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods, and the observation data comprising individual features of the individuals, the respective operations performed for the individuals, and the corresponding operation results. Parameters of the gating nodes and the expert nodes are determined using the observation data to obtain a final prediction model. Each expert node in the final prediction model is then used to evaluate, for the subgroup of individuals in the observation data matched to that expert node, the operation result of a predetermined operation on that subgroup. With this scheme, accurate operation result evaluation can be achieved automatically without adjusting hyper-parameters, and the scheme is applicable to various data types and therefore to a wider range of application scenarios.

Description

Method, electronic device and computer program product for evaluating operation results
Technical Field
The present disclosure relates to the field of data mining technology, and more particularly, to a method, electronic device, and computer program product for evaluating operation results.
Background
With the rapid development of information technology, the volume of data is growing rapidly. These data contain an enormous amount of useful information, so data mining is receiving increasingly wide attention. For example, causal inference finds widespread application in fields such as medical health, education, and ecology for mining the valuable information contained in data. In these areas it is often difficult to find a scheme that is effective for all individuals. For example, for cancer patients, the therapeutic effect of different treatment regimens varies from patient to patient; for students receiving educational training, different training programs have different effects on different students.
One solution to the above problem is to find, for a given treatment scheme, the subgroups, defined by individual features, for which the scheme works better. This problem of finding subgroups with different treatment effects is called subgroup analysis. Such subgroup analysis helps to explore, for example, heterogeneity in treatment effects.
Subgroup analysis methods can be broadly divided into two categories, namely confirmatory subgroup analysis and exploratory subgroup analysis. Confirmatory subgroup analysis is mainly used to handle a small number of predefined subgroups, whereas exploratory subgroup analysis determines subgroups with different treatment effects in a data-driven way. In confirmatory subgroup analysis, the subgroups are predefined by a professional, which is highly subjective; this subjectivity may directly lead to questionable results and opens the possibility of deliberate manipulation of the analysis results. Exploratory subgroup analysis adopts a tree-structure-based approach, which is currently a widely studied technique for identifying heterogeneity; it can identify subgroups automatically without defining them in advance and is suitable for large data volumes.
However, both the confirmatory and the exploratory subgroup analysis methods may only handle binary treatments, and the analysis results may be inaccurate. Therefore, there is a need for an improved solution for operation result evaluation.
Disclosure of Invention
In view of the above, the present disclosure proposes a method, an electronic device and a computer program product for evaluating an operation result.
According to a first aspect of the present disclosure, a method of evaluating operation results is provided. The method may comprise establishing an initial prediction model for a set of observation data, wherein the initial prediction model has a hierarchical structure and comprises a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods, and wherein the observation data comprises a plurality of individual features of a plurality of individuals, the respective operation modes, among predetermined operations, performed for the plurality of individuals, and the respective operation results. The method also includes determining parameters of the gating node and the expert nodes of the initial prediction model using the observation data to obtain a final prediction model. The method further includes performing prediction, with each expert node in the final prediction model, for the subgroup of individuals in the observation data matched to that expert node, so as to determine the operation result of the predetermined operation on that subgroup.
According to a second aspect of the present disclosure, another method for evaluating operation results is provided. The method may include receiving individual features of one or more individuals and the respective operation modes, among predetermined operations, to be performed for the one or more individuals. The method also includes determining one or more subgroups to which the one or more individuals belong based on a prediction model and the individual features of the one or more individuals, the prediction model having a hierarchical structure and including a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods. The method further includes predicting, according to the respective operation modes, the respective operation results for the one or more individuals using the expert nodes in the prediction model associated with the determined one or more subgroups.
According to a third aspect of the present disclosure, an electronic device is provided. The electronic device includes a processor and a memory coupled to the processor, the memory having instructions stored therein that, when executed by the processor, cause the electronic device to perform actions. The actions include: establishing an initial prediction model for a set of observation data, wherein the initial prediction model has a hierarchical structure and comprises a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods, and wherein the observation data comprises a plurality of individual features of a plurality of individuals, the respective operation modes, among predetermined operations, performed for the plurality of individuals, and the respective operation results; determining parameters of the gating node and the expert nodes of the initial prediction model using the observation data to obtain a final prediction model; and performing prediction, with each expert node in the final prediction model, for the subgroup of individuals in the observation data matched to that expert node, so as to determine the operation result of the predetermined operation on that subgroup.
According to a fourth aspect of the present disclosure, another electronic device is provided. The electronic device includes a processor and a memory coupled to the processor, the memory having instructions stored therein that, when executed by the processor, cause the device to perform actions. The actions include: receiving individual features of one or more individuals and the respective operation modes, among predetermined operations, to be performed for the one or more individuals; determining one or more subgroups to which the one or more individuals belong based on a prediction model and the individual features of the one or more individuals, the prediction model having a hierarchical structure and comprising a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods; and predicting, according to the respective operation modes, the respective operation results for the one or more individuals using the expert nodes in the prediction model associated with the determined one or more subgroups.
In a fifth aspect of the present disclosure, there is provided a computer readable medium having stored thereon machine executable instructions which, when executed, cause a machine to perform the method according to the first aspect.
In a sixth aspect of the disclosure, there is provided a computer readable medium having stored thereon machine executable instructions which, when executed, cause a machine to perform the method according to the second aspect.
In a seventh aspect of the disclosure, there is provided a computer program product, tangibly stored on a computer-readable medium and comprising machine executable instructions that, when executed, cause a machine to perform the method according to the first aspect.
In an eighth aspect of the disclosure, there is provided a computer program product tangibly stored on a computer-readable medium and comprising machine executable instructions that, when executed, cause a machine to perform the method according to the second aspect.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the disclosure, nor is it intended to be used to limit the scope of the disclosure.
Drawings
The above and other features of the present disclosure will become more apparent by describing in detail embodiments thereof that are illustrated in the accompanying drawings in which like reference numerals refer to the same or similar parts throughout the drawings. In the drawings:
FIG. 1 schematically illustrates a flow diagram of a method for evaluating operational results according to one embodiment of the present disclosure;
FIG. 2 schematically illustrates a schematic diagram of the structure of an initial predictive model, according to one embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of a method for optimizing an initial predictive model based on an optimization objective function, according to one embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart of a method for evaluating the results of a predetermined operation on one or more individuals, according to one embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of a system for evaluating operational results according to one embodiment of the present disclosure;
FIG. 6 schematically illustrates an example application of evaluating results of operations of a predetermined operation on a subset of individuals, according to one particular implementation of the present disclosure;
FIG. 7 schematically shows an electronic device in which embodiments according to the present disclosure can be implemented.
Detailed Description
Hereinafter, various exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. It should be noted that the drawings and the description relate to preferred embodiments by way of example only, and that, from the following description, alternative embodiments of the structures and methods disclosed herein are readily contemplated and may be employed without departing from the principles claimed in the present disclosure.
It should be understood that these exemplary embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present disclosure, and are not intended to limit the scope of the present disclosure in any way. Also in the drawings, optional steps, modules, etc. are shown in dashed boxes for illustrative purposes.
The terms "including," comprising, "and the like, as used herein, are to be construed as open-ended terms, i.e.," including/including but not limited to. The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment". Relevant definitions for other terms will be given in the following description.
In embodiments of the present disclosure, the term "treatment variable/regime" is the manner of operation in which a predetermined operation is performed for each individual under study in an observed experiment, which may be expressed in terms of a level of treatment, e.g., taking a drug or taking a placebo, various doses of a compound drug, etc., various potential strategies such as a course protocol in which the student attends. Herein, T is used to denote a treatment modality, which may also be referred to as a "treatment variable". T is a value T, which may be discrete data or continuous data. For example, for binary processing of two treatment levels, e.g., t ═ 0, 1; for a ternary process with three treatment levels, e.g., t ═ 0,1, 2; for more treatment level processing, and so on. Where the treatment level is continuous, t may be any value within a range of values.
In embodiments of the present disclosure, the term "individual feature" refers to a feature of an individual to be observed, which may also be referred to as an individual attribute. For example, in the case where the individual is a patient, the individual characteristics include, for example, the weight, age, sex, time of illness, relevant medical examination data, and the like of the patient. For the field of education, for example, the age, sex, current level (e.g., english level), whether or not a student has taken a similar course, family economy, etc. may be mentioned. Herein, the individual features are denoted by X, and may also be referred to as "covariates".
In embodiments of the present disclosure, the term "operation result" is a result that has been obtained or predicted/evaluated in the case where a predetermined operation is performed for an individual, and may also be referred to as a "treatment result". For example, in the case where the subject is a patient, the operation result may refer to the development of the condition of the subject, such as the reduction of symptoms of the condition, or the examination of the measurement of the effect, after the administration of a drug or a placebo. For the field of educational medicine, there may be learning effects of students, such as an increase in a relevant level (e.g., a level-up situation for an english level test). Herein, the treatment result is represented using the Y representation, which may also be referred to as "result variable".
In the embodiments of the present disclosure, "observation data" is data obtained in a special observation test, or data accumulated in an actual application. For example, it may be data based on operation results obtained by performing a predetermined operation on an individual in advance, and for the medical field, it may be data obtained by performing observation tests on different patients. N may indicate the number of observation samples, i.e. how many pieces of data are relevant to an individual; d is the dimension of the observed variable, or the number of observed variables, i.e. there are multiple values of the relevant parameter in each piece of data, which is the sum of the number of covariates X, treatment variables T and outcome variables.
In the embodiment of the present disclosure, the term "hidden variable" is a variable that is not observed in the causal relationship inference, but a hidden variable that needs to be known in the causal relationship inference is specifically introduced to solve the causal relationship inference problem in the present embodiment by optimizing a model.
As mentioned previously, in many practical applications it is desirable to be able to predict the effect of a certain operation on one or more subgroups, or to predict the appropriate treatment for a certain individual, so that a computing device can make decisions automatically or assist people in making decisions. In this way, it is possible to determine automatically whether a certain treatment should be performed on a certain individual, or which of a plurality of treatments should be performed. For example, it may be desirable to predict the likely effect of a certain drug or therapy on a certain patient's condition, in order to formulate a treatment regimen automatically or to assist a physician in doing so. It may also be desirable to predict how much a training course can improve the performance of a student, or to predict the impact of advertisement pushes on the final purchasing behavior of a consumer. In addition, in environmental protection and remediation applications, it is possible to predict or automatically determine the effectiveness of an environmental protection or remediation program for a given area, so as to select the remediation program best suited to that area.
As previously described, existing subgroup analysis can be broadly divided into confirmatory subgroup analysis and exploratory subgroup analysis. In confirmatory subgroup analysis, however, the number of subgroups to be analyzed and the way in which the subgroups are divided are predefined by the investigator based on experience, which is highly subjective and directly leads to questionable results and the possibility of human manipulation of the results.
In contrast, exploratory subgroup analysis is a tree-structure-based automated process that is data-driven and does not require the number of subgroups or the way in which they are partitioned to be predefined, and therefore has high objectivity. For illustrative purposes, a prior-art solution is described below as an example.
In one prior-art subgroup analysis scheme, an initial tree structure is first grown from the observation data: for each node, all possible split modes are traversed and the node is split according to the split mode that maximizes the split statistic, and the same operation is performed for the left and right child nodes of that node. This generates an initial tree structure in which subgroups have been partitioned. Pruning is then performed on the initial tree structure to remove its weakest links and simplify it. Finally, the tree structure is further reduced in size: the split statistic between each pair of terminal nodes is calculated, the less heterogeneous pairs of nodes are merged, and the calculation is repeated until all remaining subgroups exhibit sufficient heterogeneity. The tree structure obtained at this point is the resulting analysis model.
However, exploratory subgroup analysis is sensitive to hyper-parameters, which require manual adjustment to achieve good results, so its performance depends heavily on the hyper-parameter settings. Meanwhile, in the prior art, both the confirmatory and the exploratory subgroup analysis methods support only binary processing, which is suitable neither for cases requiring multivariate processing nor for parameter-based prediction with continuous processing. There is therefore a need for an improved operation result evaluation solution that at least partially addresses the above problems of the prior art.
To this end, in the embodiments of the present disclosure, a new approach for evaluating operation results is provided. In this approach, a hierarchical multi-expert initial prediction model is established for a set of observation data, and the parameters of the gating nodes and expert nodes of the initial prediction model are then determined using the observation data to obtain a final prediction model. Thereafter, each expert node in the final prediction model can be used to perform prediction for the subgroup of individuals in the observation data matched to it, so as to determine the operation result of a predetermined operation on that subgroup. Unlike existing subgroup analysis techniques, embodiments of the present disclosure are insensitive to hyper-parameters, and therefore fully automated operation result evaluation can be achieved without adjusting hyper-parameters. Meanwhile, in the embodiments of the present disclosure, each expert model is suitable not only for binary processing applications but also for multivariate processing applications, and can also handle the case where the treatment level is continuous, and thus has wider application scenarios.
Hereinafter, the technical solutions of the present disclosure will be described in detail with reference to the accompanying drawings in conjunction with specific examples. It should be noted, however, that the following description is given for illustrative purposes only, and the present disclosure is not limited thereto.
FIG. 1 schematically shows a flow chart of a method for evaluating operation results according to an embodiment of the present disclosure. The steps of the method may be performed collectively by a single processing unit in an electronic device, individually by a plurality of processing units in an electronic device, or by a plurality of processing units in a plurality of electronic devices, as long as data transmission is possible between them.
As shown in FIG. 1, first at block 110, an initial prediction model is established for a set of observation data. The initial prediction model has a hierarchical structure and includes a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods. For example, the initial prediction model may be a hierarchical mixture of experts (HME) network model. The observation data is result data recording the operations performed for individuals; specifically, it may include a plurality of individual features of a plurality of individuals, the respective operation modes, among predetermined operations, performed for the plurality of individuals, and the corresponding operation results.
The observation data may be data stored in advance in an observation database, or may be imported into the system when evaluation of operation results is required. The observation data itself may come from a third party or be collected by other means. For example, in the medical health field, medical institutions and pharmaceutical research companies perform corresponding observational tests (e.g., drug administration, placebo administration, etc.) on multiple groups of patients/volunteers to obtain observation data. As another example, for an educational training institution, the data may be accumulated over long-term teaching about the effects of students receiving educational training.
The observation data may be an N × D matrix, where N is the number of observation samples, i.e., how many pieces of data there are, and D is the dimension of the observed variables, or the number of observed variables, i.e., each piece of data contains multiple related parameters. For example, in the medical health field, N indicates the number of patients in the observation data, and D indicates the number of data items related to a patient, including the number of individual features of the patient, the corresponding operation, and the corresponding result. These data may be preprocessed in advance, for example by integration, reduction, and noise removal of the raw data. Such preprocessing operations are known per se in the art and are not described in further detail here.
Herein, T indicates the treatment variable in the observation data, i.e., the operation performed for an individual; X indicates the observed covariates, i.e., the individual features of the individual; and Y indicates the corresponding operation result. The dimensions of T, X and Y together define the dimension D of the observation data. For example, when X has 5 dimensions and T and Y each have 1, the dimension D of the observation data is 7.
Based on the observation data, a conditional distribution of the operation result Y given the treatment variable T and the covariates X may be established. Herein, it is proposed to characterize this distribution with a hierarchical network model having a plurality of experts, for example a hierarchical mixture of experts (HME) model. It should be noted, however, that this model is given for illustrative purposes only and the present disclosure is not limited thereto.
An HME network is an extension of the mixture-of-experts model; it is represented as a tree structure and mixes a number of different expert models together according to gating functions. Specifically, it is a tree structure with multiple levels, whose leaf nodes are expert nodes corresponding to a plurality of expert models that make predictions based on different expert models, while the nodes other than the leaf nodes are gating nodes indicating the grouping conditions used to group the observation data.
For the observation data, an initial HME network is first established; this initial network only contains the underlying multi-level tree structure, and its leaf nodes are expert nodes. The number of levels of the HME network and the number of expert nodes may be predetermined values chosen to meet the requirements of the observation data: on the one hand the values are adapted to the number of covariates X, while at the same time providing a certain design flexibility. For example, for observation data with 5 covariates, a 6-layer HME network with, say, 32 expert nodes may be generated. For fewer covariates (e.g., 3), a 5-layer HME network with 16 expert nodes may be employed, while for more covariates (e.g., 10), a 7-layer HME network with 64 expert nodes may be employed.
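As an illustration only, the following Python sketch (class and function names are assumed for illustration; the depth simply follows the 6-layer/32-expert example above) shows how such an initial balanced tree of gating nodes and expert leaves might be set up before any parameters are determined.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        """A node of the initial prediction model: internal nodes are gating
        nodes, leaf nodes are expert nodes.  No parameters are fixed yet."""
        depth: int
        left: Optional["Node"] = None
        right: Optional["Node"] = None
        is_expert: bool = False

    def build_initial_hme(num_layers: int) -> Node:
        """Grow a balanced binary tree with num_layers levels; the last
        level holds the expert leaves (2**(num_layers - 1) of them)."""
        def grow(d: int) -> Node:
            if d == num_layers:
                return Node(depth=d, is_expert=True)
            return Node(depth=d, left=grow(d + 1), right=grow(d + 1))
        return grow(1)

    def count_experts(node: Node) -> int:
        if node.is_expert:
            return 1
        return count_experts(node.left) + count_experts(node.right)

    if __name__ == "__main__":
        # As in the example above: 5 covariates -> a 6-layer tree with 32 experts.
        root = build_initial_hme(num_layers=6)
        print(count_experts(root))  # 32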
For illustrative purposes, FIG. 2 shows a schematic diagram of an exemplary HME model, which is a tree-based multi-layer model with a 6-layer network structure including 32 expert nodes.
The initial HME network may be represented by a corresponding mathematical expression; in practical applications, however, the model need not be represented mathematically in the electronic device, and the HME network may instead be constructed by determining the parameters associated with the model. As a simple illustrative example, the number of expert nodes and the number of gating nodes may indicate the structure of the initial HME network.
The construction and mathematical characterization of the initial prediction model are described in detail below with reference to a specific example of the HME model and are not repeated here.
With continued reference to FIG. 1, next at block 120, parameters of the gating nodes and the expert nodes of the initial prediction model are determined using the observation data to obtain a final prediction model. For example, the initial prediction model may be reduced and parameter values determined for the remaining gating and expert nodes: it is determined which of the individual features X each gating node corresponds to and what its segmentation point value is, as well as which expert nodes are used in the model and what the parameters of the expert model corresponding to each expert node are.
The initial network model determined in block 110 is only the initial architecture: it contains no information about which individual feature each gating node corresponds to, no information about the segmentation points (grouping division points) of those features, and no determined parameters for the expert models corresponding to the expert nodes. In block 120, the structure of the initial model is reduced using the observation data and the parameters of the nodes in the model are determined.
In one embodiment according to the present disclosure, an optimization objective function for the initial prediction model may be determined, and the initial prediction model is optimized based on the observation data and the optimization objective function to determine the final prediction model.
For example, the optimization objective function for the initial prediction model may be determined based on a factorized asymptotic Bayesian (FAB) method. Optionally, hidden variables are introduced into the optimization objective function determined based on the FAB method, the hidden variables indicating whether each piece of observation data matches each expert node in the prediction model. In this way, during optimization, not only can the parameter values of each gating node and expert node be determined, but the observation data matched to each expert can also be determined.
For purposes of illustration, FIG. 3 shows a flow chart of a method for optimizing the initial prediction model based on the optimization objective function according to one embodiment of the present disclosure. As shown in FIG. 3, at block 310, the hidden variables are first optimized based on the optimization objective function to determine the optimized probabilities that the observation data match the respective expert nodes. Next, at block 320, the existing prediction model is reduced based on the determined optimized probabilities to arrive at a reduced prediction model. Then, at block 330, the gating nodes and the expert nodes in the reduced prediction model may be optimized based on the optimization objective function to determine optimized parameter values for the respective gating nodes and expert nodes, thereby generating an optimized prediction model.
It should be noted that the above steps may be performed repeatedly until the objective function converges, that is, until the difference between two successive optimization results is within a predetermined threshold range. Alternatively, the repetition may stop after a predetermined number of iterations. This yields the final prediction model that can be used to perform prediction.
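The alternating procedure of blocks 310-330 and the convergence test described above can be sketched as follows; update_assignments, prune_experts, update_parameters and objective are hypothetical callables standing in for the hidden-variable optimization, the model reduction, the node-parameter optimization and the optimization objective function, respectively.

    def optimize_prediction_model(model, data, update_assignments, prune_experts,
                                  update_parameters, objective,
                                  tol=1e-4, max_iter=100):
        """Alternate the steps of blocks 310-330 until the objective converges
        (difference between two successive values within tol) or until a
        predetermined number of iterations has been performed."""
        prev_value = float("-inf")
        for _ in range(max_iter):
            q = update_assignments(model, data)        # block 310: optimize hidden variables
            model = prune_experts(model, q)            # block 320: reduce the model
            model = update_parameters(model, data, q)  # block 330: optimize node parameters
            value = objective(model, data, q)
            if abs(value - prev_value) < tol:          # convergence test
                break
            prev_value = value
        return model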
The determination of the optimization objective function and the optimization process will be described in detail below with reference to examples, and will not be described herein again.
Next, at block 130, each expert node in the final prediction model may be used to perform prediction for the subgroup of individuals in the observation data matched to it, so as to determine the operation results of the predetermined operation on the individual subgroups.
As mentioned earlier, the reduction of the model structure is performed based on the probabilities that the expert nodes in the structure match the respective pieces of observation data; that is, in the final prediction model, the matching relationship between the observation data and the respective expert nodes is already implied. In the final prediction network, the portion of the observation data matched to each expert node constitutes an individual subgroup. For each individual subgroup, each piece of data in it may be predicted with the corresponding expert node, and the operation result evaluation values of the individuals within the subgroup can then be aggregated as the evaluation value of the operation result of the predetermined operation for that subgroup. For example, the average of the operation result evaluation values of the individuals may be taken as the final operation result evaluation value for the subgroup. It should be noted, however, that the average is merely an example, and other forms of aggregate value may be adopted as the final operation result evaluation value.
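As a small illustration of the averaging choice mentioned above (the function names and the binary-treatment difference are assumptions for the example, not part of the disclosure):

    import numpy as np

    def evaluate_subgroup(expert_predict, X_sub, t_levels=(0, 1)):
        """Average the per-individual evaluations of one expert node over the
        individuals matched to it; here the evaluated quantity is the
        difference in predicted operation result between two treatment
        levels (the binary-processing case)."""
        effects = [expert_predict(x, t_levels[1]) - expert_predict(x, t_levels[0])
                   for x in X_sub]
        return float(np.mean(effects))

    if __name__ == "__main__":
        tau, w = 1.5, np.array([0.2, -0.1])

        def toy_expert(x, t):          # toy linear expert: y is roughly tau*t + w.x
            return tau * t + w @ x

        X_sub = np.random.default_rng(0).normal(size=(20, 2))
        print(evaluate_subgroup(toy_expert, X_sub))   # about 1.5 for this expert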
It is to be understood that, in embodiments of the present disclosure, the generated prediction model includes expert models adapted to the respective subgroups, and these expert models are suitable not only for binary processing but also for ternary processing, and can also evaluate results for continuous processing. For example, for ternary processing, three operation result evaluations corresponding to three different treatment levels may be obtained, while for continuous processing levels, an operation result evaluation curve may be obtained.
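For the multi-level and continuous cases just mentioned, the same expert can simply be evaluated at several treatment levels; the sketch below (illustrative names and a toy response shape, not taken from the disclosure) produces the per-level evaluations for a ternary treatment and a curve over a continuous range.

    import numpy as np

    def evaluate_levels(expert_predict, x, t_values):
        """Predicted operation result of one expert node for an individual x
        at each treatment level in t_values (e.g. (0, 1, 2) for ternary
        processing, or a dense grid for a continuous treatment level)."""
        return np.array([expert_predict(x, t) for t in t_values])

    if __name__ == "__main__":
        def toy_expert(x, t):                     # toy diminishing-returns response
            return 2.0 * np.log1p(t) + 0.3 * x.sum()

        x = np.array([0.5, -1.0])
        print(evaluate_levels(toy_expert, x, (0, 1, 2)))             # ternary case
        print(evaluate_levels(toy_expert, x, np.linspace(0, 5, 6)))  # continuous curve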
The resulting expert models and/or the above operation result evaluations may be transmitted to other processing modules in the electronic device, or in other electronic devices, for use in other processes, such as adjusting a drug formulation, automatically recommending appropriate medication advice, adjusting the course structure settings of an educational institution, automatically recommending a suitable treatment regimen, and the like.
With the above implementation, individuals in the observation data can be grouped automatically, and the operation result of the predetermined operation on each individual subgroup can be evaluated. The whole process is data-driven and does not depend on any subjective judgment. Moreover, although hyper-parameters also exist in the parameter-determination process of the present disclosure, repeated observations show that the operation result evaluation according to the present disclosure is not sensitive to the values of the hyper-parameters; that is, the evaluation does not depend on the hyper-parameters, so no hyper-parameter tuning is needed and the accuracy of the results is not affected by the hyper-parameter values. Meanwhile, when the model is optimized based on FAB, an L0 norm can be adopted in the optimization process, which further effectively alleviates the overfitting problem.
In the following, the determination of the optimization objective function and the network optimization procedure are described in connection with specific examples. It should be noted, however, that these are merely exemplary methods given for illustrative purposes, and the present disclosure is not limited thereto.
Furthermore, it should be noted that in the following description each process of prediction model construction and objective function determination is given to explain the specific principles of the present disclosure; this does not mean that each step of these processes is performed in the solution provided by the present disclosure. Rather, it is more likely that the expressions characterizing the HME network and/or the expressions of the optimization objective function have been pre-stored in the electronic device. When observation data actually need to be evaluated, the specific model and expressions to be used may simply be determined, based on the pre-stored prediction model and optimization objective function, according to the conditions of the observation data. For example, the number of levels of the prediction model, the number of expert nodes, the number of gating nodes, and the respective parameters to be used in the optimization objective function may be determined, and the values of these parameters are then supplied to the expressions below.
HME network model
In the following, some parameters used in the HME network are first defined in order to characterize it.
For an HME network, let G indicate the number of gating nodes and E the number of expert nodes. For the g-th gating node (g = 1, ..., G), a binary hidden variable U_g ∈ {0, 1} may be defined. The hidden variable U_g indicates whether a piece of observation data is associated with/matched to an expert node on the left branch of the gating node, i.e., whether the evaluation of the operation result for that piece of data is generated by an expert node on the left branch of the gating node. U_g = 1 means that the piece of observation data is associated with some expert node on the left branch of the gating node; U_g = 0 means that it is not, i.e., it is associated with an expert node on the right branch.
For each individual feature vector x in the observation data, the probability that the operation result is generated by an expert node on the left branch of gating node g may be expressed as
b(x, θ_g) = P(U_g = 1 | x, θ_g),    (Equation 1)
where θ_g = (α_g, β_g, γ_g) are the parameters with which gating node g splits on segmentation dimension γ_g at a predetermined segmentation point β_g: γ_g indicates the segmentation dimension, i.e., on which covariate in X the split is performed; β_g indicates the segmentation point, i.e., the value at which the split is made; and α_g indicates the probability of splitting in this manner.
On this basis, the conditional distribution of the e-th expert can be expressed as
p(y | x, t, φ_e),    with φ_e = (τ_e, w_e, σ_e^2),
where the parameter τ_e indicates, in expert e (e = 1, ..., E), the average causal effect of the treatment T on the operation result Y, and w_e and σ_e^2 indicate the remaining parameters used by the expert model. The expert model in this example is a linear prediction model, but this is merely exemplary and the disclosure is not limited thereto.
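A sketch of one such linear expert, under the assumption (made here only for illustration) that the expert models the operation result as Gaussian noise around τ_e·t + w_e·x:

    import numpy as np
    from dataclasses import dataclass

    @dataclass
    class LinearExpert:
        """One expert node: a linear predictive model of the operation result.
        tau is the treatment-effect coefficient, w the covariate weights and
        sigma2 the noise variance (Gaussian noise is assumed here)."""
        tau: float
        w: np.ndarray
        sigma2: float

        def mean(self, x, t):
            return self.tau * t + self.w @ x

        def log_likelihood(self, x, t, y):
            mu = self.mean(x, t)
            return -0.5 * (np.log(2 * np.pi * self.sigma2)
                           + (y - mu) ** 2 / self.sigma2)

    if __name__ == "__main__":
        expert = LinearExpert(tau=1.2, w=np.array([0.5, -0.3]), sigma2=0.25)
        x, t, y = np.array([1.0, 2.0]), 1, 1.1
        print(expert.mean(x, t), expert.log_likelihood(x, t, y))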
In addition, some further quantities may be defined to simplify the later characterization of the HME model. Let G_g denote the indices of all expert nodes located on the subtree of the g-th gating node (g = 1, ..., G), and let ε_e denote the indices of all gating nodes on the unique path from the root node to the e-th expert node (e = 1, ..., E). On this basis, a function h(ξ, g, e) can further be defined, where h(ξ, g, e) is determined according to whether the e-th expert node is on the left or right subtree of gating node g, and ξ is a generic quantity that can be a hidden variable, another variable or function, or a specific number.
Substituting the hidden variable U_g and the probability b(x, θ_g) of Equation 1 for ξ, respectively, gives the following two functions:
h_U(g, e) := h(U_g, g, e) ∈ {0, 1}, a binary hidden variable;
h_b(x, g, e) := h(b(x, θ_g), g, e), a probability function.
Here, h_U(g, e) indicates whether a piece of observation data is associated with the e-th expert node under the branch of the g-th gating node, and h_b(x, g, e) indicates the probability that a piece of observation data is associated with the e-th expert node under the branch of the g-th gating node.
Thus, the initial mixture-of-experts model can be expressed as
p(y | x, t, θ, φ) = Σ_{e=1}^{E} [ Π_{g∈ε_e} h_b(x, g, e) ] · p(y | x, t, φ_e),    (Equation 4)
where y indicates the result variable Y; x indicates the respective dimensions of the covariates X; t indicates the treatment variable; θ = (θ_1, ..., θ_G) indicates the parameters of the gating nodes in the HME model; and φ = (φ_1, ..., φ_E) indicates the parameters of the expert nodes in the HME model.
Latent variable introduction
For the HME model described above, in order to perform subgroup analysis on the observation data, it is necessary to know, in addition to the observed variables, whether each piece of observation data matches each expert node. Therefore, a binary hidden variable Z_e is further defined for the HME network; Z_e indicates whether a piece of observation data matches the e-th expert node in the initial prediction model. Specifically, the match between expert node e and a piece of observation data can be expressed as
z_e = Π_{g∈ε_e} h_U(g, e),
where e indicates the index of the expert node (e = 1, ..., E); g indicates the index of a gating node (g = 1, ..., G); ε_e indicates the indices of all gating nodes on the unique path from the root node to the e-th expert node; and h_U(g, e) indicates whether the observation is associated with the e-th expert node under the branch of the g-th gating node, a value of 1 meaning that it is associated and a value of 0 meaning that it is not.
If the operation result of a piece of observation data matches expert node e, then Z_e = 1; otherwise Z_e = 0. The matching information between the observation data and the E expert nodes can be represented by Z = (Z_1, ..., Z_E), a component-assignment vector of hidden variables. Thus, the observation data and the latent variables may be represented as
observation data: (x^N, t^N, y^N) = {(x_n, t_n, y_n)}, n = 1, ..., N;
latent variables: z^N = {z_n}, n = 1, ..., N, where z_n = (z_{n1}, ..., z_{nE}) and Σ_{e=1}^{E} z_{ne} = 1.
Based on the above observations and hidden variables, the HME network in Equation 4 above may be further represented as
p(y^N, z^N | x^N, t^N, θ, φ) = Π_{n=1}^{N} Π_{e=1}^{E} [ ( Π_{g∈ε_e} h_b(x_n, g, e) ) · p(y_n | x_n, t_n, φ_e) ]^{z_{ne}}.
Next, an optimization objective function for the initial prediction model may be established based on the above HME function with the hidden variables introduced. In the following, the FAB method is taken as an example for constructing the optimization objective function.
Optimization objective function construction
First, let M denote the model, including the parameters θ and φ required by the HME model in Equation 4 above. Based on a Bayesian approach, a model that maximizes the following posterior can then be selected:
p(M | y^N, x^N, t^N) ∝ p(M | x^N, t^N) · p(y^N | x^N, t^N, M).    (Equation 7)
Using a uniform model prior p(M | x^N, t^N), the quantity of particular interest is p(y^N | x^N, t^N, M). If q(z^N) denotes a distribution over the variables z^N, then
log p(y^N | x^N, t^N, M) ≥ E_{q(z^N)} [ log ( p(y^N, z^N | x^N, t^N, M) / q(z^N) ) ],
with equality holding when q(z^N) = p(z^N | y^N, x^N, t^N, M). Thus p(y^N | x^N, t^N, M) can be estimated by estimating and optimizing the lower bound on the right-hand side of the above inequality.
According to the Bayesian method, p(y^N, z^N | x^N, t^N, M) can be further represented as the integral, over the parameters θ and φ with their priors under M, of the complete-data likelihood p(y^N, z^N | x^N, t^N, θ, φ) (Equation 9), which factorizes into gating-node factors and expert-node factors. For each factor, D_g and D_e denote the dimensionalities of θ_g and φ_e, the corresponding maximum likelihood estimates of θ_g and φ_e are used, and the effective number of samples associated with that factor is taken into account. The Laplace approximation is then applied to each factor of the decomposition: with [A, a] denoting the quadratic form of a matrix A and a vector a, each gating-related factor and each expert-related factor is expanded to second order around its maximum likelihood estimate, using a factorized approximation of the corresponding Fisher information matrix. As a result, log p(y^N, z^N | x^N, t^N, θ, φ) can be approximated by the sum of these second-order expansions (Equation 13).
Substituting Equation 13 into Equation 9, and treating the priors p(θ_g | M) and p(φ_e | M) as constants, p(y^N, z^N | x^N, t^N, M) can be expressed in a form in which the contribution of each gating node and each expert node appears together with a penalty determined by its dimensionality (D_g or D_e) and its effective number of samples. By further ignoring the asymptotically minor terms, the optimization objective function FIC is obtained, in which each expert node e contributes a component function term S_e.
The expectation involved in the FIC is difficult to obtain in practice, so the FIC is difficult to estimate directly. FAB inference is therefore performed on an asymptotically consistent lower bound of the FIC: by bounding the objective with respect to an arbitrary scalar, a lower bound of FIC(y^N, x^N, t^N, M) is obtained, and the initial prediction network can be optimized through the corresponding maximization problem. If q, θ and φ are fixed in turn, the optimization objective function can be further reduced to a simpler form in which S = (S_1, ..., S_E) is the vector of component function terms.
In this way, an optimization objective function suitable for optimizing the HME model described above is obtained, and solving this objective function yields the optimized prediction network.
The solution of the optimization objective function involves two processes: the first is reduction of the structure of the initial prediction network by optimizing the hidden variables; the second is optimization of the parameters of the gating nodes and expert nodes based on the structure of the reduced prediction network. These two processes are described below by way of example.
Network architecture reduction
In one embodiment of the present disclosure, the network structure will be reduced by optimizing the probability distribution of hidden variables.
First, based on the above optimization objective function, the optimized probability q characterizing the match between the expert nodes and the observation data can be expressed in terms of the lower bound introduced above. By taking the derivative of that expression with respect to q_{ne} and setting the derivative to zero, the update equation for the optimized probability q_{ne} is obtained.
In this way, the optimized probabilities q of matching between the expert nodes and the observation data are obtained. Then, according to these optimized matching probabilities, the experts with low matching probability (i.e., poor effectiveness), together with the corresponding branches, are removed from the initial prediction model. For example, an expert node is removed when its total matching probability falls below a judgment threshold δ used to decide whether to remove the expert node, and the remaining probabilities are renormalized with the corresponding normalization constant.
Through this model reduction or shrinkage step, the unneeded expert nodes and their associated branches are removed from the network model; in other words, irrelevant expert nodes are removed from the initial prediction model, which starts as a symmetric tree structure with a large number of expert nodes.
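A sketch of this shrinkage step on the matrix of optimized matching probabilities (the normalization details and the exact form of the threshold test are assumptions made for illustration):

    import numpy as np

    def shrink_assignments(q, delta=1e-2):
        """Model-reduction step: q is an (N, E) matrix of optimized matching
        probabilities q_ne.  Experts whose total responsibility falls below
        the judgment threshold delta are removed, and the remaining rows are
        renormalized."""
        share = q.sum(axis=0) / q.sum()            # relative weight of each expert
        keep = share >= delta                       # experts to retain
        q_kept = q[:, keep]
        q_kept = q_kept / q_kept.sum(axis=1, keepdims=True)   # renormalize rows
        return q_kept, np.flatnonzero(keep)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        q = rng.dirichlet(alpha=[5, 5, 0.05, 5], size=100)   # expert 2 is weak
        q_new, kept = shrink_assignments(q, delta=0.05)
        print(kept, q_new.shape)   # typically keeps experts 0, 1 and 3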
Model node parameter determination
After the irrelevant expert nodes are removed, the parameters of the nodes in the resulting model are optimized. In this embodiment, the component function term vector S and the parameters θ and φ are further optimized. The corresponding optimization objective equation can be written down based on the FIC objective; since there is no cross term between (S, φ) and θ, i.e., no coupling between the two, they can be optimized separately, giving two separate optimization objective equations (Equations 24a and 24b).
Equation 24a can be further rewritten in terms of the following quantities: the set of observation data whose value in dimension γ_g is greater than (or equal to) β_g; the set of observation data whose value in dimension γ_g is less than β_g; the indices of all expert nodes located only on the left subtree of the g-th gating node; and the indices of all expert nodes located only on the right subtree of the g-th gating node. Given the other quantities, the solution of this equation with respect to α_g is an analytical (closed-form) solution.
In Equation 24b, given S_e, D_e indicates the number of effective parameters of expert e; using the L0 norm of φ_e, it can be represented as D_e = ||ω_e||_0 + 2. In this way, Equation 24b can be converted into a feature selection problem with discrete constraints (Equation 25), in which C is a regularization term and the remaining quantity is a corresponding predetermined constant. Since the objective function in Equation 25 is expressed in terms of φ_e, this L0-regularized feature selection problem can be solved using a forward-backward (FoBa) greedy algorithm.
It can be seen that the L0 norm is used in the optimization of the parameters of the gating and expert nodes, which reduces the overfitting problem.
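The sketch below shows a simplified forward-backward greedy selection for one linear expert, as an illustration of how such an L0-constrained fit can be approximated; the thresholds and the squared-error criterion are assumptions, and this is not the exact FoBa procedure referenced above.

    import numpy as np

    def foba_select(X, y, k_max=5, min_gain=1.0, backward_tol=0.5):
        """Greedily add the covariate that most reduces the squared error
        (forward step), then drop any covariate whose removal costs little
        compared with the last gain (backward step).  Stops when no addition
        reduces the error by at least min_gain."""
        n, d = X.shape
        selected = []

        def sse(idx):
            if not idx:
                return float(y @ y)
            Xs = X[:, idx]
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            r = y - Xs @ beta
            return float(r @ r)

        current = sse(selected)
        while len(selected) < k_max:
            candidates = [j for j in range(d) if j not in selected]
            if not candidates:
                break
            gain, j_best = max((current - sse(selected + [j]), j) for j in candidates)
            if gain < min_gain:
                break
            selected.append(j_best)
            current = sse(selected)
            while len(selected) > 1:                      # backward step
                loss, j_drop = min((sse([j for j in selected if j != j0]) - current, j0)
                                   for j0 in selected)
                if loss < backward_tol * gain:
                    selected.remove(j_drop)
                    current = sse(selected)
                else:
                    break
        return sorted(selected)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 8))
        y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + 0.1 * rng.normal(size=200)
        print(foba_select(X, y))   # expected to recover features 1 and 4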
The above steps of model structure reduction and model node parameter optimization may be performed iteratively until the objective function converges; that is, each further optimization is based on the current prediction network. The first optimization is performed on the basis of the initial prediction network, and each subsequent optimization further refines the prediction network obtained in the previous iteration. The iterations may be repeated until the difference between the prediction networks of two consecutive iterations is below a predetermined threshold; alternatively, the iterations may stop when a predetermined number of iterations is exceeded, or a combination of both criteria may be used.
Through such iterations, a final prediction model may be obtained in which the individual feature and split point associated with each gating node, as well as the expert models used in the prediction model and their parameters, have been determined.
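By way of illustration only, the alternating procedure can be sketched as a simple driver loop; the callables stand in for the hidden-variable optimization, model shrinkage, and node-parameter optimization steps described above, and the stopping test combines the two criteria just mentioned (a small change in the objective, or a maximum iteration count):

from typing import Any, Callable

def fit_until_convergence(model: Any,
                          data: Any,
                          e_step: Callable[[Any, Any], Any],
                          shrink: Callable[[Any, Any], Any],
                          m_step: Callable[[Any, Any, Any], Any],
                          objective: Callable[[Any, Any, Any], float],
                          tol: float = 1e-4,
                          max_iter: int = 50) -> Any:
    """Alternate model shrinkage and node-parameter optimization until the
    objective (e.g. the FIC lower bound) changes by less than tol, or until
    max_iter iterations have been run."""
    prev = float("-inf")
    for _ in range(max_iter):
        q = e_step(model, data)            # optimize the hidden-variable distribution
        model = shrink(model, q)           # remove irrelevant expert nodes/branches
        model = m_step(model, q, data)     # optimize gating and expert parameters
        value = objective(model, q, data)
        if abs(value - prev) < tol:        # converged
            break
        prev = value
    return model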
It should be noted that the above describes the whole process of creating the HME network in detail; in practical applications, however, the objective function need not be built up step by step as above. Instead, the parameters M related to the HME network expression in equation 4, for example, may be determined first; the prediction model may then be reduced based on, for example, equations 21 and 22 using these parameters and the observation data; and the optimized parameter values of the gating nodes and the expert nodes may be solved based on, for example, equations 26 and 27.
The above describes a technical solution in which an optimized prediction network is obtained from observation data, and the operation results are evaluated for each subgroup using the expert nodes in the network. However, the technical solution of the present disclosure may also be implemented in other ways; for example, an operation effect prediction may be performed for an individual using the final prediction model determined in step 120. This is explained below with reference to FIG. 4.
FIG. 4 schematically illustrates a flow chart of a method for evaluating the results of a predetermined operation on one or more individuals, according to one embodiment of the present disclosure. The steps of the method may be performed collectively by a single processing unit in an electronic device, individually by a plurality of processing units in the electronic device, or by a plurality of processing units in a plurality of electronic devices, as long as data can be transmitted between them.
As shown in FIG. 4, at block 410, individual characteristics of one or more individuals and the corresponding operation modes of predetermined operations performed with respect to the one or more individuals are received. An individual here is, for example, a patient, a student, or another subject, and the individual characteristics are individual attributes that may be related to the result of the operation. In one example, they include the patient's weight, age, sex, duration of illness, and so on; in another example, they include the student's age, gender, current level (e.g., English level), whether similar classes were attended, family economic situation, and the like.
Then, at block 420, one or more subgroups to which the one or more individuals belong are determined based on a prediction model and the individual characteristics of the one or more individuals, the prediction model having a hierarchical structure and comprising a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods. Each gating node in the prediction model indicates a corresponding individual feature and its split point, based on which the individuals can be divided into corresponding subgroups.
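For illustration, routing an individual to its subgroup can be pictured as a walk over the gating nodes, each comparing one individual feature with its split point; the node representation below is a hypothetical sketch, with a toy tree loosely mirroring the calcium example discussed later:

from dataclasses import dataclass
from typing import Union

@dataclass
class ExpertNode:
    name: str                                   # identifies the subgroup / expert model

@dataclass
class GatingNode:
    feature: str                                # individual feature checked at this node
    split_point: float                          # split point for that feature
    left: Union["GatingNode", ExpertNode]       # taken when feature value < split_point
    right: Union["GatingNode", ExpertNode]      # taken otherwise

def route(node: Union[GatingNode, ExpertNode], individual: dict) -> ExpertNode:
    """Walk down the gating nodes until an expert node (subgroup) is reached."""
    while isinstance(node, GatingNode):
        node = node.left if individual[node.feature] < node.split_point else node.right
    return node

# toy tree for illustration only
tree = GatingNode("illness_years", 6.5,
                  left=GatingNode("age", 15.0,
                                  left=ExpertNode("short illness, under 15"),
                                  right=ExpertNode("short illness, 15 or older")),
                  right=ExpertNode("illness over 6.5 years"))
print(route(tree, {"illness_years": 4.0, "age": 12.0}).name)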
In one embodiment, the predictive model may be a pre-established optimization model, which may be constructed by the following operations. For example, an initial prediction model is first established for a set of observation data, wherein the initial prediction model has a hierarchical structure and includes a gating node for grouping individuals and a plurality of different expert nodes for performing prediction based on different prediction methods, and the observation data includes a plurality of individual features of a plurality of individuals, respective ones of the predetermined operations performed for the plurality of individuals, and respective operation results. Parameters of gated nodes and expert nodes of the initial predictive model are then determined using the observation data to obtain a final predictive model. The process of predetermining the predictive model is similar to that described above in connection with blocks 110 and 120 of fig. 1. For detailed information thereof, reference may be made to the description made in conjunction with fig. 1-3, which will not be described herein again.
Next, at block 430, the corresponding operation results are predicted for the one or more individuals according to the corresponding operation modes, using the expert nodes of the prediction model associated with the determined one or more subgroups. The predicted operation result may be, for example, a value indicating the likely effect of the patient taking a medication, a plurality of effect values at a plurality of different doses of the medication for the patient, or an effect curve over continuous doses of the medication for the patient.
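As a minimal sketch of block 430, assuming a linear expert model of the form outcome = w·x + τ·treatment + b (an illustrative assumption, not the expert form prescribed by the disclosure), the expert associated with the individual's subgroup predicts the outcome for the requested operation mode, and predicting under both treatment values gives an estimated treatment effect:

import numpy as np

class LinearExpert:
    """Illustrative linear expert: outcome = w·x + tau * treatment + b."""

    def __init__(self, w, tau, b):
        self.w = np.asarray(w, dtype=float)
        self.tau = tau                      # treatment-effect coefficient
        self.b = b

    def predict(self, features, treatment):
        return float(self.w @ np.asarray(features, dtype=float)
                     + self.tau * treatment + self.b)

expert = LinearExpert(w=[0.01, 0.002], tau=0.015, b=0.85)   # assumed parameters
x = [12.0, 4.0]                                             # e.g. age, illness years
print("predicted outcome (treated):", expert.predict(x, treatment=1))
print("estimated treatment effect:",
      expert.predict(x, treatment=1) - expert.predict(x, treatment=0))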
FIG. 5 schematically shows a block diagram of a system for evaluating operation results according to one embodiment of the present disclosure. As shown in FIG. 5, the system includes an observation database 501, a prediction model construction unit 510, a model optimization unit 520, and an operation result estimation unit 530.
The observation database 501 stores observation data such as individual characteristics of patients, the corresponding operations, and the corresponding operation effects. These data may be preprocessed in advance, for example by integration, reduction, and noise removal of the raw data.
The prediction model construction unit 510 receives the observation data and constructs an initial prediction model for it. The prediction model may be the aforementioned HME model, which has a tree-like hierarchy and mixes multiple expert models by means of gating nodes.
The model optimization unit 520 optimizes the initial network using an optimization objective function based on the observation data. Optionally, the model optimization unit 520 may include an objective function determination unit 522, a hidden variable optimization unit 524, a model reduction unit 526, and a node optimization unit 528.
The objective function determination unit 522 may determine an optimization objective function for the initial prediction model based on, for example, the FAB (factorized asymptotic Bayesian) approach. For example, the variational distribution q mentioned above can be used to obtain an asymptotic approximation of the marginal log-likelihood, i.e., the FIC function described above. The lower bound of the FIC of the HME model is then taken as the optimization objective function of the FAB algorithm.
The hidden variable optimization unit 524 may optimize the variational distribution q based on the above optimization objective function. The model reduction unit 526 may remove the expert nodes, and their related branches, that match the observation data poorly, according to the result of the variational distribution optimization.
The node optimization unit 528 optimizes the parameters of the gating nodes and the expert nodes in the reduced HME model based on the above optimization objective function.
The model optimization unit 520 causes the hidden variable optimization unit 524, the model reduction unit 526, and the node optimization unit 528 to operate iteratively until the FIC converges, thereby obtaining the final prediction model.
The operation result estimation unit 530 obtains the prediction results produced by the expert nodes of the final prediction model on the observation data of their respective associated subgroups.
The operation result estimation unit 530 may be separate from the units described above, and may be located in the same electronic device as those units or in another electronic device in which the final prediction model is stored. It may receive data relating to an individual, such as the individual's characteristics and the operation to be evaluated. In this way, the subgroup to which the individual belongs and the corresponding expert model may be determined based on the final prediction model, and the operation result prediction may be performed for the individual based on that expert model, as described in conjunction with FIG. 4. In addition, it should be noted that the operation result estimation unit 530 may also receive a batch of individual data to be evaluated and give a corresponding operation result evaluation for each individual.
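Purely as an illustration of how the units of FIG. 5 might be wired together in software (all class, method, and attribute names are hypothetical):

class EvaluationSystem:
    """Illustrative wiring of the units in FIG. 5; names are hypothetical."""

    def __init__(self, observation_db, model_builder, model_optimizer, estimator):
        self.observation_db = observation_db    # observation database 501
        self.model_builder = model_builder      # prediction model construction unit 510
        self.model_optimizer = model_optimizer  # model optimization unit 520 (522-528)
        self.estimator = estimator              # operation result estimation unit 530

    def build_final_model(self):
        data = self.observation_db.load()
        initial_model = self.model_builder.build(data)
        return self.model_optimizer.optimize(initial_model, data)

    def evaluate_individuals(self, final_model, individuals, operation_mode):
        # unit 530: route each individual to its subgroup's expert and predict
        return [self.estimator.predict(final_model, ind, operation_mode)
                for ind in individuals]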
In addition, it should be noted that the system may be implemented by one electronic device or by a plurality of electronic devices; each functional block in the drawings may be implemented by one processing unit or by a plurality of processing units in one or more electronic devices.
Hereinafter, implementations of the disclosed solution are described in conjunction with some specific embodiments. It should be noted, however, that these are merely exemplary and the present disclosure is not limited thereto.
In the medical field, it is often necessary to know how effective a drug is for different types of patients, in order to subsequently improve the drug's formulation or to help the physician or patient decide whether the drug, or a particular dose, is suitable. In existing solutions, this analysis and selection is usually performed by professionals based on their own experience. Even where subgroup analysis methods do exist, the accuracy of the analysis result still depends on the subjective judgment of professionals, because such methods are sensitive to the hyper-parameters. With the method of the present disclosure, a final prediction network may be determined automatically from the observation data and analyzed to obtain an assessment of the effect to be achieved for different types of patients.
FIG. 6 schematically illustrates a diagram for evaluating the results of a predetermined operation on subgroups of individuals, according to one particular implementation of the present disclosure. The observation data, comprising medical data of 198 children, are shown as the input data in FIG. 6. Each record indicates the bone-density effect of a calcium agent or a placebo on one child. As shown, each observation includes 7 variables: a treatment variable T, a result variable Y, and 5 covariates X1-X5. The treatment T is a binary variable indicating whether the patient took the calcium agent or the placebo. The result variable Y is a continuous output, the total body bone mineral density (TBBMD). The 5 covariates indicate 5 characteristics of the patient: X1 is age (6-18 years), X2 is gender (male or female), X3 is race (white, African American, other), X4 is the Tanner stage (TS), taking values 1-6 in order, and X5 is the duration of illness (2-9 years).
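For concreteness, each observation record in this example could be represented roughly as follows; the field names and integer encodings in the comments are illustrative assumptions that merely follow the variable descriptions above:

from dataclasses import dataclass

@dataclass
class Observation:
    treatment: int        # T: 1 = calcium agent, 0 = placebo
    tbbmd: float          # Y: total body bone mineral density (continuous)
    age: float            # X1: 6-18 years
    sex: int              # X2: e.g. 0 = female, 1 = male
    race: int             # X3: e.g. 0 = white, 1 = African American, 2 = other
    tanner_stage: int     # X4: TS, values 1-6
    illness_years: float  # X5: 2-9 years

record = Observation(treatment=1, tbbmd=0.92, age=12.0, sex=1,
                     race=0, tanner_stage=3, illness_years=4.0)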
When the data of the 198 children are input into a processing unit of the electronic device, a 6-layer HME model with 32 expert nodes can first be built for the 198 records; the model is then reduced based on the FAB method according to, for example, equations 19 and 20, and the optimized parameters of the expert nodes and the gating nodes in the model are determined using, for example, equations 24 and 25. Finally, a final network may be output as shown at the lower right of FIG. 6, where the number of observations (i.e., the size of the individual subgroup) associated with each expert node is shown to the left or right of the node, while the value indicating the treatment effect calculated with each expert network for the corresponding observations is shown below the node. Each evaluation value is shown as the average over all individuals within the subgroup, predicted based on the expert model.
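The per-subgroup values shown under the expert nodes can be read as average predicted treatment effects; a rough sketch of such a computation is given below, where the treated-minus-untreated difference form, the feature vector, and the helper interfaces (reusing the routing and expert sketches above) are all illustrative assumptions:

import numpy as np
from collections import defaultdict

def subgroup_average_effects(route_fn, experts, observations):
    """Average the predicted treated-vs-untreated outcome difference within
    each subgroup, where route_fn maps an observation to its expert node
    (e.g. the route() walk sketched earlier) and experts maps a node name
    to an expert model exposing predict(features, treatment)."""
    effects = defaultdict(list)
    for obs in observations:
        leaf = route_fn(obs)                          # subgroup for this observation
        x = [obs["age"], obs["illness_years"]]        # illustrative feature vector
        delta = (experts[leaf.name].predict(x, 1)
                 - experts[leaf.name].predict(x, 0))  # treated minus untreated
        effects[leaf.name].append(delta)
    return {name: float(np.mean(vals)) for name, vals in effects.items()}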
As can be seen from the above results, the calcium agent has an effect on children whose duration of illness is below 6.5 years and whose age is less than 15 years, with a better effect on boys than on girls. In addition, the calcium agent is not effective for patients whose duration of illness exceeds 6.5 years or whose age exceeds 15 years.
With embodiments of the present disclosure, an assessment of the effect of the calcium agent on the respective subgroups may be output, based on which, for example, subsequent formulations of the agent may be adjusted. In addition, the physician or patient can determine on this basis whether it is appropriate to use the calcium agent, helping them select a suitable medication. Additionally or alternatively, the patient's data may be entered directly into an electronic device, which automatically determines which subgroup the patient belongs to and the corresponding expert model based on the patient's individual characteristics, and uses that expert model to determine the effect of the calcium agent on the patient. Where the model contains multiple drug-related models associated with one disease, a more suitable and more effective drug may also be selected automatically for the patient.
Further, in the field of education, a student may need to select from among a plurality of similar courses that differ in their detailed arrangement (for example, English courses that differ in the proportions of listening, speaking, reading, and writing), or an educational institution may need to recommend a more suitable course to the student. Educational institutions often have observation data in this regard: characteristics of students who previously attended classes D, E, F, and so on, such as age, gender, whether they attended similar classes, and family economic situation, together with each student's performance after attending the corresponding class, such as test scores and awards won.
Based on such observation data, the method of the present disclosure can be used to obtain the different learning schemes applicable to different types of students and, on that basis, to help a student currently making a course selection to decide, or to recommend a more suitable course. As noted above, conventional methods cannot accurately and objectively recommend a course to a student; because the scheme provided by the present disclosure is insensitive to the hyper-parameters, a more accurate evaluation result can be obtained.
Further, it is noted that the plurality of expert models employed in the prediction model of the present disclosure can be applied to various types of data, such as discrete binary treatments and ternary or higher-order treatments, as well as continuous treatments.
Fig. 7 illustrates a schematic block diagram of an example device 700 that may be used to implement embodiments of the present disclosure. As shown, device 700 includes a Central Processing Unit (CPU)701 that may perform various appropriate actions and processes in accordance with computer program instructions stored in a Read Only Memory (ROM)702 or computer program instructions loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processing unit 701 performs the various methods and processes described above, such as any of the processes 100, 300, and 400. For example, in some embodiments, processes 100, 300, and 400 may be implemented as a computer software program or computer program product that is tangibly embodied in a machine-readable medium, such as storage unit 708. In some embodiments, part or all of a computer program may be loaded onto and/or installed onto device 700 via ROM 702 and/or communications unit 709. When the computer program is loaded into RAM 703 and executed by CPU 701, one or more steps of any of processes 100, 300, and 400 described above may be performed. Alternatively, in other embodiments, CPU 701 may be configured to perform any of processes 100, 300, and 400 in any other suitable manner (e.g., by way of firmware).
According to some embodiments of the present disclosure, a computer-readable medium is provided, on which a computer program is stored, which program, when executed by a processor, implements a method according to the present disclosure.
It will be appreciated by those skilled in the art that the steps of the method of the present disclosure described above may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed over a network of computing devices. Alternatively, they may be implemented as program code executable by a computing device, such that the program code may be stored in a memory device and executed by the computing device, or the individual modules or steps may be implemented as separate integrated circuit modules, or a plurality of the modules or steps may be implemented as a single integrated circuit module. As such, the present disclosure is not limited to any specific combination of hardware and software.
It should be understood that although several means or sub-means of the apparatus have been mentioned in the detailed description above, such division is merely exemplary and not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more of the means described above may be embodied in a single means; conversely, the features and functions of one means described above may be further divided so as to be embodied by a plurality of means.
The above description covers only optional embodiments of the present disclosure and is not intended to limit the present disclosure; various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure shall be included in the protection scope of the present disclosure.

Claims (24)

1. A method of evaluating results of operations, comprising:
establishing an initial prediction model for a set of observation data, wherein the initial prediction model has a hierarchical structure and comprises a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods, and wherein the observation data comprises a plurality of individual features of a plurality of individuals, respective ones of the predetermined operations performed for the plurality of individuals, and respective operation results;
determining parameters of a gating node and an expert node of the initial prediction model by using the observation data to obtain a final prediction model; and
and predicting the individual subgroups matched with the expert nodes in the final prediction model to determine operation results of the predetermined operation on the individual subgroups.
2. The method of claim 1, wherein determining parameters of a gated node and an expert node of the initial predictive model using the observation data comprises:
an optimization objective function for the initial prediction model is determined, and the initial prediction model is optimized based on the observation data and the optimization objective function to determine the final prediction model.
3. The method of claim 1, wherein determining an optimization objective function for the initial predictive model comprises:
an optimization objective function for the initial prediction model is determined based on a decomposition progressive Bayesian method.
4. The method of claim 3, wherein the determining an optimized objective function for the initial predictive model based on a decomposition progressive Bayesian method further comprises:
introducing hidden variables into the optimized objective function determined based on the decomposition progressive Bayesian method, wherein the hidden variables indicate whether each piece of observation data is matched with each expert node in a prediction model.
5. The method of claim 4, wherein optimizing the initial predictive model based on the observation data and the optimization objective function comprises:
optimizing the hidden variables based on the optimization objective function by using the observation data to determine the optimization probability of the observation data matching each expert node;
reducing the existing prediction model based on the determined optimization probability to obtain a simplified prediction model;
and optimizing the gate control nodes and the expert nodes in the simplified prediction model based on the optimization objective function to determine the optimization parameter values of the gate control nodes and the optimization parameter values of the expert nodes, so as to generate an optimized prediction model.
6. The method of claim 5, wherein the operations of determining the optimized probability distribution, curtailment of a current predictive model, and optimization of gating nodes and expert nodes are repeated to determine the final predictive model that can be used to make predictions.
7. The method of claim 1, wherein the initial predictive network is a hierarchical hybrid expert network model.
8. The method of claim 1, wherein the predetermined operation comprises operation of at least three different treatment levels.
9. The method of claim 1, wherein the predetermined operation comprises an operation based on a continuous operation treatment level.
10. A method for evaluating results of operations, comprising:
receiving individual characteristics of one or more individuals and corresponding operation modes in predetermined operations performed on the one or more individuals;
determining one or more subgroups to which the one or more individuals belong based on a prediction model and individual characteristics of the one or more individuals, the prediction model having a hierarchical structure and comprising a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods; and
according to the corresponding operation mode, utilizing expert nodes in the prediction model associated with the determined one or more subgroups to predict corresponding operation results for the one or more individuals.
11. The method of claim 10, wherein the predictive model is predetermined by:
establishing an initial prediction model for a set of observation data, wherein the initial prediction model has a hierarchical structure and comprises a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods, the observation data comprising a plurality of individual features of a plurality of individuals, respective ones of the predetermined operations performed for the plurality of individuals, and respective operation results;
and determining parameters of a gating node and an expert node of the initial prediction model by using the observation data to obtain a final prediction model.
12. An electronic device, comprising:
a processor; and
a memory coupled with the processor, the memory having instructions stored therein that, when executed by the processor, cause the electronic device to perform acts comprising:
establishing an initial prediction model for a set of observation data, wherein the initial prediction model has a hierarchical structure and comprises a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods, the observation data comprising a plurality of individual features of a plurality of individuals, respective ones of the predetermined operations performed for the plurality of individuals, and respective operation results;
determining parameters of a gating node and an expert node of the initial prediction model by using the observation data to obtain a final prediction model; and
and predicting the individual subgroups matched with the expert nodes in the final prediction model to determine operation results of the predetermined operation on the individual subgroups.
13. The electronic device of claim 12, wherein utilizing the observation data to determine parameters of a gating node and an expert node of the initial prediction model comprises:
an optimization objective function for the initial prediction model is determined, and the initial prediction model is optimized based on the observation data and the optimization objective function to determine the final prediction model.
14. The electronic device of claim 13, wherein determining an optimization objective function for the initial predictive model comprises:
an optimization objective function for the initial prediction model is determined based on a decomposition progressive Bayesian method.
15. The electronic device of claim 14, wherein the determining an optimized objective function for the initial predictive model based on a decomposition progressive bayesian method further comprises:
introducing hidden variables into the optimized objective function determined based on the decomposition progressive Bayesian method, wherein the hidden variables indicate whether each piece of observation data is matched with each expert node in a prediction model.
16. The electronic device of claim 15, wherein optimizing the initial predictive model based on the observation data and the optimization objective function comprises:
optimizing the hidden variables based on the optimization objective function to determine the optimization probability of the observation data matching with each expert node;
reducing the existing prediction model based on the determined optimization probability to obtain a simplified prediction model;
and optimizing the gate control nodes and the expert nodes in the simplified prediction model based on the optimization objective function to determine the optimization parameter values of the gate control nodes and the optimization parameter values of the expert nodes, so as to generate an optimized prediction model.
17. The electronic device of claim 14, wherein the operations of determining the optimized probability distribution, curtailment of a current predictive model, and optimization of gating nodes and expert nodes are repeated to determine the final predictive model that can be used to make predictions.
18. The electronic device of claim 11, wherein the initial predictive network is a hierarchical hybrid expert network model.
19. The electronic device of claim 11, wherein the predetermined operation comprises operations of at least three different levels of treatment.
20. The electronic device of claim 11, wherein the predetermined operation comprises an operation based on a continuous operation treatment level.
21. An electronic device, comprising:
a processor; and
a memory coupled with the processor, the memory having instructions stored therein that, when executed by the processor, cause the electronic device to perform acts comprising:
receiving individual characteristics of one or more individuals and corresponding operation modes in predetermined operations performed on the one or more individuals;
determining one or more subgroups to which the one or more individuals belong based on a prediction model and individual characteristics of the one or more individuals, the prediction model having a hierarchical structure and comprising a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods; and
according to the corresponding operation mode, utilizing expert nodes in the prediction model associated with the determined one or more subgroups to predict corresponding operation results for the one or more individuals.
22. The electronic device of claim 19, wherein the optimized predictive model is predetermined by:
establishing an initial prediction model for a set of observation data, wherein the initial prediction model has a hierarchical structure and comprises a gating node for grouping individuals and a plurality of different expert nodes for performing predictions based on different prediction methods, the observation data comprising a plurality of individual features of a plurality of individuals, respective ones of the predetermined operations performed for the plurality of individuals, and respective operation results;
and determining parameters of a gating node and an expert node of the initial prediction model by using the observation data to obtain a final prediction model.
23. A computer program product tangibly stored on a computer-readable medium and comprising machine executable instructions that, when executed, cause a machine to perform the method of any of claims 1 to 9.
24. A computer program product tangibly stored on a computer-readable medium and comprising machine executable instructions that, when executed, cause a machine to perform the method of any of claims 10 to 11.
CN202010245570.2A 2020-03-31 2020-03-31 Method, electronic device and computer program product for evaluating operation results Pending CN113469203A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010245570.2A CN113469203A (en) 2020-03-31 2020-03-31 Method, electronic device and computer program product for evaluating operation results

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010245570.2A CN113469203A (en) 2020-03-31 2020-03-31 Method, electronic device and computer program product for evaluating operation results

Publications (1)

Publication Number Publication Date
CN113469203A true CN113469203A (en) 2021-10-01

Family

ID=77865583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010245570.2A Pending CN113469203A (en) 2020-03-31 2020-03-31 Method, electronic device and computer program product for evaluating operation results

Country Status (1)

Country Link
CN (1) CN113469203A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114137967A (en) * 2021-11-23 2022-03-04 常熟理工学院 Driving behavior decision method based on multi-network joint learning
CN114137967B (en) * 2021-11-23 2023-12-15 常熟理工学院 Driving behavior decision method based on multi-network joint learning

Similar Documents

Publication Publication Date Title
Cinar et al. Phylogenetic multilevel meta‐analysis: A simulation study on the importance of modelling the phylogeny
US8078554B2 (en) Knowledge-based interpretable predictive model for survival analysis
Kaplan et al. A two-step Bayesian approach for propensity score analysis: Simulations and case study
Heo et al. Sample size requirements to detect an intervention by time interaction in longitudinal cluster randomized clinical trials
JP2021511584A (en) Systems and methods for modeling probability distributions
Zhou et al. Penalized spline of propensity methods for treatment comparison
Liu et al. Propensity‐score‐based meta‐analytic predictive prior for incorporating real‐world and historical data
Scherer et al. A tutorial on the meta-analytic structural equation modeling of reliability coefficients.
Spanbauer et al. Nonparametric machine learning for precision medicine with longitudinal clinical trials and Bayesian additive regression trees with mixed models
CN115881306A (en) Networked ICU intelligent medical decision-making method based on federal learning and storage medium
Mattos et al. Likelihood-based inference for censored linear regression models with scale mixtures of skew-normal distributions
Makar et al. A distillation approach to data efficient individual treatment effect estimation
Zhou et al. Incorporating external data into the analysis of clinical trials via Bayesian additive regression trees
CN113469203A (en) Method, electronic device and computer program product for evaluating operation results
Yousefi et al. Learning genomic representations to predict clinical outcomes in cancer
Han et al. Multiply robust federated estimation of targeted average treatment effects
EP4220650A1 (en) Systems and methods for designing augmented randomized trials
JP7355115B2 (en) Methods, electronic devices, and computer program products for predicting operational results
Chen et al. Understanding machine learning classifier decisions in automated radiotherapy quality assurance
Davis‐Plourde et al. Power analyses for stepped wedge designs with multivariate continuous outcomes
Butler Using patient preferences to estimate optimal treatment strategies for competing outcomes
Peng et al. Latent Gaussian copula models for longitudinal binary data
US20230352125A1 (en) Systems and Methods for Adjusting Randomized Experiment Parameters for Prognostic Models
US20240169187A1 (en) Systems and Methods for Supplementing Data With Generative Models
CN116844712A (en) Colorectal cancer BRB model optimization method and device based on parameter constraint condition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination