CN109300549B - Food-disease association prediction method based on disease weighting and food category constraint - Google Patents

Food-disease association prediction method based on disease weighting and food category constraint

Info

Publication number
CN109300549B
Authority
CN
China
Prior art keywords
food
disease
group
matrix
steps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811180791.5A
Other languages
Chinese (zh)
Other versions
CN109300549A (en
Inventor
王嫄
张耀功
陈赠光
王靖寰
杨巨成
赵青
陈亚瑞
孔娜
王洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University of Science and Technology
Original Assignee
Tianjin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University of Science and Technology filed Critical Tianjin University of Science and Technology
Priority to CN201811180791.5A priority Critical patent/CN109300549B/en
Publication of CN109300549A publication Critical patent/CN109300549A/en
Application granted granted Critical
Publication of CN109300549B publication Critical patent/CN109300549B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/60ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to nutrition control, e.g. diets

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Data Mining & Analysis (AREA)
  • Nutrition Science (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention relates to a food-disease association prediction method based on disease weighting and food category constraint, comprising the following steps: constructing a disease weighting relation from international disease classification data; constructing a food similarity network from the food ingredient table; constructing food group relations from a food classification system; constructing the known binary food-disease association network; randomly initializing the representations of foods and diseases in the latent space; introducing the disease weighting relation and the food group relations, and learning the latent-space representations of foods and diseases; and outputting the predicted food-disease associations from those latent-space representations. The method is reasonably designed, mitigates the sparsity of food-disease association data, and improves the accuracy of the food-disease association prediction model. It also makes the model's computational time complexity linear in the number of foods in a food group, which reduces computational complexity and the consumption of computing resources.

Description

Food-disease association prediction method based on disease weighting and food category constraint
Technical Field
The invention belongs to the technical field of food safety, and particularly relates to a food-disease association prediction method based on disease weighting and food category constraint.
Background
As residents' purchasing power and health awareness rise, people are no longer satisfied with meeting basic material needs and place ever higher demands on quality of life and healthy living. The most typical example is the growing demand for healthy dietary guidance. Diet has been shown to be closely related to the onset and progression of disease, and the relationship is often surprising and far-reaching. For example, a diet based mainly on animal foods can lead to chronic diseases (such as obesity, coronary heart disease, tumors, and osteoporosis), whereas a diet based on plant foods is the most beneficial to health and the most effective in preventing and controlling chronic diseases.
To study these relationships, researchers usually collect data through regional population sampling, questionnaires, oral interviews, or in vivo experiments, and then analyze it statistically. However, these methods consume enormous manpower and material resources, in vivo experiments with high confidence carry especially large risks, and it remains difficult to meet people's need for detailed information on food-disease associations. The typical risks are that respondents enter incorrect information in questionnaires, that questionnaire indicators yield biased statistics, and that the outcomes observed in respondents result from many interacting factors rather than from a single food variable. The handling procedures of experimenters in in vivo experiments are another source of risk. Meanwhile, as the number of food types grows rapidly, the cost of experiments and surveys grows exponentially; because manpower and material resources are limited, empirical studies cannot be updated in time and can only focus on a few diseases and a few food categories. Furthermore, the fine-grained relationships among food amount, eating method, and disease remain unclear: the huge number of variables makes global statistics extremely difficult, yet the amount consumed and the eating method are important factors in whether a food actually causes disease.
In summary, the association between food and disease is a topic of intense current interest. No prediction method with high confidence and practical guiding value yet exists for associations across a wide range of foods and diseases. How to guide research on food-disease associations, narrow the scope of investigation, and reduce the manpower and resources consumed by undirected trials is a problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a food-disease association prediction method based on disease weighting and food category constraint. The method predicts food-disease associations computationally and combines the hierarchical relationship of disease categories with the effect of food groups, which makes the prediction more robust and mitigates the sparsity of food-disease association data.
The invention solves this technical problem with the following technical scheme:
A food-disease association prediction method based on disease weighting and food category constraint comprises the following steps:
Step 1: constructing a disease weighting relation by using international disease classification data;
Step 2: constructing a food similarity network by using the ingredient table;
Step 3: constructing a food group relation by using a food classification system;
Step 4: constructing a known binary food-disease association network;
Step 5: randomly initializing the representations of foods and diseases in the latent space;
Step 6: introducing the disease weighting relation and the food group relation, and learning the latent-space representations of foods and diseases;
Step 7: outputting the predicted food-disease associations by using the latent-space representations of foods and diseases.
Further, step 1 is implemented as follows:
First, a disease relevance matrix $S_1$ is constructed from international disease classification data. If two diseases stand in a parent-child relationship in the international disease classification, the element $(S_1)_{ij}$ of the disease relevance matrix is set to 1; otherwise $(S_1)_{ij} = 0$. Here disease $i$ is the parent category of disease $j$, and disease $j$ is a specific subclass of disease $i$.
Then, $\mathrm{depth}(i,j)$, the depth of parent node $i$, and $C(\mathrm{depth}(i,j))$, the depth weight of the edge formed by parent node $i$ and child node $j$, are defined as follows:
[Equation image BDA0001822562430000021]
$$C(\mathrm{depth}(i,j)) = 1 + \log(\mathrm{depth}(i,j))$$
Finally, after weighting by hierarchy level, the disease relevance matrix is expressed as:
$$(S'_1)_{ij} = (S_1)_{ij} \cdot C(\mathrm{depth}(i,j)).$$
Further, step 2 is implemented as follows: in the food similarity network, each node is a "food-amount-eating method" combination. When the "amount-eating method" of two nodes differs, their relation is set to 0. When the "amount-eating method" is the same, the similarity between the two foods is computed from the food ingredient table with the cosine formula and used as the node relation value, which yields the food similarity network $S_2$.
Further, step 3 is implemented as follows: the 20 categories of the nationally stipulated food classification system are used as the relations among foods. Foods with the same amount, the same eating method, and the same category are placed in one group, so each group element is a "food name-amount-eating method" triple. Foods are classified into different food groups according to their properties and component ratios.
Further, step 4 is implemented as follows: the known food-disease associations are represented by a binary matrix $R_{n \times m}$. The "food name-amount-eating method" triple is used as a refinement of the food, and associations are modeled as "food name-amount-eating method-disease" quadruples. A quadruple whose association has been verified is set to 1, and all others to 0. The rows of the matrix are the $n$ "food name-amount-eating method" triples and the columns are the $m$ diseases.
Further, step 5 is implemented as follows: the latent-space representations of foods and diseases, $U_{n \times K}$ and $V_{K \times m}$, are randomly initialized by assigning each entry of the two matrices an arbitrary number between 0 and 1.
Further, step 6 is implemented as follows:
The food-disease association matrix $R$ is decomposed into the product of the food representation matrix $U$ and the disease representation matrix $V$, and the decomposition objective function is defined as:
[Equation image BDA0001822562430000022]
The weighted disease hierarchy is introduced to constrain two diseases with an adjacent parent-child relationship to stay relatively close in the latent space:
[Equation image BDA0001822562430000031]
where $\mathrm{tr}(\cdot)$ denotes the trace of the matrix in parentheses; $S'_1$ is the symmetric matrix of the diseases; the diagonal matrix $D'_1$ has entries $(D'_1)_{ii} = \sum_j (S'_1)_{ij}$; the graph Laplacian is $L_1 = D'_1 - S'_1$; $\|A\|^2$ is the $L_2$ regularization value of matrix $A$; $V_{\cdot i}$ and $V_{\cdot j}$ are the column vectors of the $i$-th and $j$-th columns of $V$; and $A^T$ is the transpose of matrix $A$.
The ordinary graph Laplacian is applied to the food similarity:
[Equation image BDA0001822562430000032]
where $S_2$ is the food similarity network, $D_2$ is a diagonal matrix whose diagonal entries are the row sums of $S_2$, $(D_2)_{ii} = \sum_j (S_2)_{ij}$, and $L_2 = D_2 - S_2$.
The food group relation is then introduced. The geometric center of all foods of a group in the latent space is taken as the group center, and every member of the group should lie close to it. In each iteration, the center of each group is computed from the $U$ and $V$ of the previous iteration, and these centers are treated as fixed in the current iteration. The group-center constraint is expressed as:
[Equation image BDA0001822562430000033]
where [symbol image BDA0001822562430000034] is the $j$-th element of food group $G$, [symbol image BDA0001822562430000035] is the geometric center of food group $G$, and [symbol image BDA0001822562430000036] is the Euclidean distance between member $j$ of group $G$ and the center of its group. Merging $R_0$, $R_1$, and $R_2$ into the basic matrix factorization objective [symbol image BDA0001822562430000037] gives the following objective function:
[Equation image BDA0001822562430000038]
where $\lambda_0$, $\lambda_1$, and $\lambda_2$ are manually specified parameters with the following ranges: $\lambda_0$ and $\lambda_1$ are selected from the set {0, 0.001, 0.01, 0.1, 1, 10, 100, 1000}, and $\lambda_2$ is selected from the set {1, 10, 100, 1000}. The latent-space representations $U$ and $V$ of foods and diseases are then obtained by gradient descent.
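As an illustration of the candidate sets above, the following sketch enumerates every combination of the three parameters, which is how the detailed description says the values are chosen. The function names and the `evaluate` scoring callback are hypothetical and not part of the patent, and a higher score is assumed to be better.

```python
from itertools import product

# Candidate values as stated in the text.
LAMBDA_0_1_CANDIDATES = [0, 0.001, 0.01, 0.1, 1, 10, 100, 1000]
LAMBDA_2_CANDIDATES = [1, 10, 100, 1000]


def parameter_grid():
    """Return every (lambda0, lambda1, lambda2) combination."""
    return list(product(LAMBDA_0_1_CANDIDATES,
                        LAMBDA_0_1_CANDIDATES,
                        LAMBDA_2_CANDIDATES))


def grid_search(evaluate):
    """Return the combination with the highest score.

    `evaluate` is a hypothetical callback that trains a model with the
    given parameters and returns a validation score (higher is better).
    """
    return max(parameter_grid(), key=evaluate)
```

The grid holds 8 × 8 × 4 = 256 combinations, so an exhaustive search requires one full training run per combination.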
Further, step 7 is implemented as follows: the dot product of the $i$-th row of the food latent-space representation $U$ and the $j$-th column of the disease latent-space representation $V$ gives the predicted relation value between "food-amount-eating method" $i$ and disease $j$.
The invention has the following advantages and positive effects:
1. Within a matrix factorization framework, the invention takes into account the hierarchical relationship of disease classification and the group relationship of food categories, and applies a weighting strategy and a group-center strategy: the disease weighting relation is computed from the disease classification hierarchy, and food groups are constructed from food classification information. Both serve as prior constraints when modeling disease-food associations. This mitigates the sparsity of food-disease association data, uses prior knowledge to make prediction more robust, and improves the accuracy of the food-disease association prediction model. In addition, the group-center concept makes the model's computational time complexity linear in the number of foods in a food group, which reduces computational complexity and the consumption of computing resources.
2. The invention combines the hierarchical relationship of disease categories with the effect of food groups, which helps identify new food-disease associations and can further guide research on healthy diets. The latent-space representations of foods and diseases produced by the invention can also be applied widely in other food- and disease-related research.
Drawings
FIG. 1 is an overall process flow diagram of the present invention;
FIG. 2 is a flowchart of the algorithm of step 6 of the present invention.
Detailed Description
The embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The design idea of the invention is as follows: in the fields of nutrition and food safety, the invention applies matrix factorization and semantic space theory and techniques from machine learning. Food-disease association matrix factorization serves as the basic framework, and two kinds of prior knowledge are introduced: the weighting relation contained in the disease classification hierarchy, and food group information. The invention does not consider food-borne diseases caused by pathogenic agents, and rests on two assumptions: (1) the higher a disease sits in the classification hierarchy, the more abstract its meaning; the lower it sits, the more specific its meaning. (2) Foods are classified into different groups according to their properties and component ratios; a food group reveals similarity information at the food level, and the nutrients that the foods of a group supply to the human body after consumption, and their effects, are related. Accordingly, the method constructs a loss function and introduces group centers in the solving process, which reduces the model's computational time complexity and resource consumption and mitigates the sparsity of the food-disease associations obtained from earlier research.
Based on the above design concept, the food-disease association prediction method of the present invention, as shown in FIG. 1, includes the following steps:
Step 1: constructing a disease weighting relation by using international disease classification data.
In this step, the disease relevance matrix $S_1$ is constructed from the international disease classification data: if two diseases stand in a parent-child relationship in the international disease classification, that is, disease $i$ is the parent category of disease $j$ and disease $j$ is a specific subclass of disease $i$, then $(S_1)_{ij} = 1$; otherwise $(S_1)_{ij} = 0$.
Further, given the hierarchical tree structure of the disease classification, general diseases at higher levels are loosely related, while specific diseases at deeper levels are more closely related. To capture this feature of the disease data, the invention introduces the variable $\mathrm{depth}(i,j)$ and the auxiliary function $C(\mathrm{depth}(i,j))$. $\mathrm{depth}(i,j)$ denotes the depth of parent node $i$; if parent node $i$ is the root node, $\mathrm{depth}(i,j) = 1$. It is defined as follows:
[Equation image BDA0001822562430000041]
$C(\mathrm{depth}(i,j))$ is the depth weight of the edge formed by parent node $i$ and child node $j$, and is defined as:
$$C(\mathrm{depth}(i,j)) = 1 + \log(\mathrm{depth}(i,j))$$
After weighting by hierarchy level, the disease relevance matrix is expressed as:
$$(S'_1)_{ij} = (S_1)_{ij} \cdot C(\mathrm{depth}(i,j)).$$
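A minimal sketch of this step, under stated assumptions: the depth of the root is 1 and each level below it adds 1, the logarithm is the natural logarithm (the text does not state the base), and only the parent-to-child entry is filled, as the text describes. The function names and the three-node hierarchy are hypothetical and are not real classification codes.

```python
import math


def node_depth(node, parent_of):
    """Depth of a node in the classification tree; the root has depth 1."""
    depth = 1
    while parent_of.get(node) is not None:
        node = parent_of[node]
        depth += 1
    return depth


def weighted_disease_matrix(diseases, parent_of):
    """Build S1 and the depth-weighted S1' as nested lists."""
    index = {name: pos for pos, name in enumerate(diseases)}
    size = len(diseases)
    s1 = [[0.0] * size for _ in range(size)]
    s1_weighted = [[0.0] * size for _ in range(size)]
    for child, parent in parent_of.items():
        if parent is None:
            continue  # the root has no parent edge
        i, j = index[parent], index[child]
        s1[i][j] = 1.0
        # C(depth) = 1 + log(depth); the natural log is an assumption.
        weight = 1.0 + math.log(node_depth(parent, parent_of))
        s1_weighted[i][j] = s1[i][j] * weight
    return s1, s1_weighted


# Hypothetical three-level hierarchy for illustration only.
diseases = ["root", "chapter", "category"]
parent_of = {"root": None, "chapter": "root", "category": "chapter"}
s1, s1_weighted = weighted_disease_matrix(diseases, parent_of)
```

With this weighting, an edge whose parent is the root keeps weight 1, and edges deeper in the tree receive larger weights. If a symmetric matrix is needed later, the entry would also have to be mirrored to position (j, i), which this sketch does not do.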
Step 2: constructing a food similarity network by using the ingredient table.
In the food similarity network, each node is a "food-amount-eating method" combination. First, each food is expressed as an ingredient vector that holds the values of calories, dietary fibre, calcium, magnesium, iron, manganese, zinc, and so on per 100 grams of the edible part of the food; that is, each dimension represents the amount of one component contained in 100 g of the food. Second, when the "amount-eating method" of two nodes differs, their relation is set to 0; when the "amount-eating method" is the same, the cosine formula is applied to the food vectors to compute the similarity between the two foods, which is used as the node relation value and yields the food similarity network $S_2$. The cosine formula used here is:
$$\cos(a, b) = \frac{a \cdot b}{\|a\| \, \|b\|}$$
where $a$ and $b$ are the two food ingredient vectors.
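The construction can be sketched as follows. The node tuples, ingredient values, and function names are hypothetical; leaving the diagonal at 0 and returning 0 for an all-zero vector are assumptions, since the text does not cover those cases.

```python
import math


def cosine(a, b):
    """Cosine similarity of two ingredient vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def food_similarity_network(nodes, ingredients):
    """nodes: list of (food, amount, method); ingredients: food -> vector."""
    size = len(nodes)
    s2 = [[0.0] * size for _ in range(size)]
    for p in range(size):
        for q in range(size):
            if p == q:
                continue  # assumption: self-similarity is left at 0
            same_amount_and_method = nodes[p][1:] == nodes[q][1:]
            if same_amount_and_method:
                s2[p][q] = cosine(ingredients[nodes[p][0]],
                                  ingredients[nodes[q][0]])
    return s2


# Hypothetical nodes and two-component ingredient vectors.
nodes = [("rice", "100g", "boiled"),
         ("wheat", "100g", "boiled"),
         ("rice", "100g", "fried")]
ingredients = {"rice": [1.0, 0.0], "wheat": [1.0, 1.0]}
s2 = food_similarity_network(nodes, ingredients)
```

In this toy example only the first two nodes share an "amount-eating method", so they are the only pair with a nonzero relation value.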
Step 3: constructing a food group relation by using a food classification system.
In this step, the 20 categories of the nationally stipulated food classification system are used as the relations among foods: grains and grain products, edible oils, meat and meat products, sterilized fresh milk, dairy products, aquatic products, canned foods, sugar, cold foods, beverages, distilled or prepared liquors, fermented liquors, seasonings, bean products, cakes, confectionery, pickles, health foods (according to the "health food management method"), new resource foods (according to the "new resource food management sanitation method"), and other foods. In the invention, foods with the same amount, the same eating method, and the same category are placed in one group, so each group element is a "food-amount-eating method" triple.
Foods are classified into different groups according to their properties and component ratios; a food group reveals similarity information at the food level, and the nutrients that the foods of a group supply to the human body after consumption, and their effects, are related. One heuristic is therefore that all foods in a group are functionally similar and may cause similar or identical diseases. Each group has one group center.
The invention does not require every pair of foods in a group to stay close in vector space, because the computational complexity would then be quadratic in the number of foods in the group. Instead, the invention introduces the group-center concept and takes the geometric center of all foods of a group in the latent space as the group center. [Symbol image BDA0001822562430000052] denotes the latent-space vector of the $j$-th element of food group $G$, and [symbol image BDA0001822562430000053] denotes the geometric center of food group $G$. Then [symbol image BDA0001822562430000054] is the Euclidean distance between member $j$ of group $G$ and the center of its group.
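A minimal sketch of the group-center idea, assuming the geometric center is the arithmetic mean of the members' latent vectors. The function names, group labels, and latent vectors are hypothetical.

```python
import math


def group_centers(u, groups):
    """Mean latent vector of each group.

    u: list of latent row vectors, one per food item.
    groups: dict mapping a group name to the list of row indices in it.
    """
    centers = {}
    for name, members in groups.items():
        dims = len(u[members[0]])
        centers[name] = [sum(u[j][d] for j in members) / len(members)
                         for d in range(dims)]
    return centers


def distance_to_center(u, j, center):
    """Euclidean distance between item j and a group center."""
    return math.sqrt(sum((x - c) ** 2 for x, c in zip(u[j], center)))


# Hypothetical latent vectors for three food items in two groups.
u = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
groups = {"grains": [0, 1], "dairy": [2]}
centers = group_centers(u, groups)
```

Each member is compared with one center instead of with every other member, so the number of distance terms for a group grows linearly with the group size instead of quadratically.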
Step 4: constructing a known binary food-disease association network.
In this step, the known food-disease associations are represented by the binary matrix $R_{n \times m}$. The association between food and disease is refined here: the "food name-amount-eating method" triple is introduced as a refinement of the food, and associations are modeled as "food name-amount-eating method-disease" quadruples. A quadruple whose association has been verified is set to 1, and all others to 0. The rows of the matrix are the $n$ "food name-amount-eating method" triples, and the columns are the $m$ diseases. Verification relies mainly on data from scientific research papers. When the experimental results of newer and older papers disagree, the more recently published paper in the journal with the higher impact factor is taken as the authority on whether a "food-amount-eating method" is associated with a "disease".
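The matrix construction can be sketched as below. The function name is hypothetical, and the food items, diseases, and verified quadruples are invented placeholders, not findings from any study.

```python
def build_association_matrix(food_items, diseases, verified):
    """Binary matrix R with one row per triple and one column per disease.

    food_items: list of (food name, amount, eating method) triples.
    diseases: list of disease names.
    verified: iterable of (food name, amount, eating method, disease).
    """
    row_of = {item: r for r, item in enumerate(food_items)}
    col_of = {disease: c for c, disease in enumerate(diseases)}
    r_matrix = [[0] * len(diseases) for _ in food_items]
    for name, amount, method, disease in verified:
        r_matrix[row_of[(name, amount, method)]][col_of[disease]] = 1
    return r_matrix


# Placeholder data for illustration only.
food_items = [("food_a", "high", "fried"), ("food_a", "low", "boiled")]
diseases = ["disease_x", "disease_y", "disease_z"]
verified = [("food_a", "high", "fried", "disease_y")]
r_matrix = build_association_matrix(food_items, diseases, verified)
```

Because the same food name appears in several rows with different amounts and eating methods, the matrix can hold different associations for each way of consuming that food.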
Step 5: randomly initializing the representations of foods and diseases in the latent space.
This step is implemented as follows: the latent-space representations of foods and diseases, $U_{n \times K}$ and $V_{K \times m}$, are randomly initialized by assigning each entry of the two matrices an arbitrary number between 0 and 1.
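A minimal sketch of the initialization. A uniform distribution is an assumption (the text only says "any number between 0 and 1"), and the function name and the optional `seed` parameter are hypothetical additions for reproducibility.

```python
import random


def init_latent(n, m, k, seed=None):
    """Return U (n x k) and V (k x m) with entries drawn from [0, 1)."""
    rng = random.Random(seed)
    u = [[rng.random() for _ in range(k)] for _ in range(n)]
    v = [[rng.random() for _ in range(m)] for _ in range(k)]
    return u, v


# Example: 4 food items, 3 diseases, latent dimension 2.
u, v = init_latent(n=4, m=3, k=2, seed=0)
```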
Step 6: introducing the disease weighting relation and the food group relation, and learning the latent-space representations of foods and diseases. The implementation of this step is shown in FIG. 2.
On the basis of matrix factorization, the method considers two constraints that external knowledge places on modeling the latent-space representations of foods and diseases: (1) the higher a disease sits in the classification hierarchy, the more abstract its meaning; the lower it sits, the more specific its meaning. (2) Foods are classified into different groups according to their properties and component ratios; a food group reveals similarity information at the food level, and the nutrients that the foods of a group supply to the human body after consumption, and their effects, are related. Basic matrix factorization modeling is explained first, and the weighting relation constraint and the food group constraint are then introduced step by step.
Following the matrix factorization method, the food-disease association matrix $R$ is decomposed into the product of the food representation matrix $U$ and the disease representation matrix $V$, and the decomposition objective function is defined as:
[Equation image BDA0001822562430000061]
By introducing the weighted disease hierarchy, the invention constrains two diseases with an adjacent parent-child relationship to stay close in the latent space:
[Equation image BDA0001822562430000062]
where $\mathrm{tr}(\cdot)$ denotes the trace of the matrix in parentheses, i.e., the sum of the elements on its main diagonal (the diagonal from top left to bottom right). $S'_1$ is the symmetric matrix of the diseases. The invention defines the diagonal matrix $D'_1$ with $(D'_1)_{ii} = \sum_j (S'_1)_{ij}$, i.e., each diagonal entry is the corresponding row sum of $S'_1$. The invention defines $L_1 = D'_1 - S'_1$, i.e., the graph Laplacian. $\|A\|^2$ is the $L_2$ regularization value of matrix $A$, i.e., the sum of the squares of all nonzero elements of $A$. $V_{\cdot i}$ and $V_{\cdot j}$ are the column vectors of the $i$-th and $j$-th columns of $V$. $A^T$ is the transpose of matrix $A$, i.e., the row and column indices of each element of $A$ are swapped; every other superscript $T$ has the same meaning.
Because there is no hierarchical structure among foods, the invention applies the ordinary graph Laplacian to the food similarity:
[Equation images BDA0001822562430000064 and BDA0001822562430000071]
where $S_2$ is defined in step 2, $D_2$ is a diagonal matrix whose diagonal entries are the row sums of $S_2$, $(D_2)_{ii} = \sum_j (S_2)_{ij}$, and $L_2 = D_2 - S_2$.
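The patent's regularizer equations are images that are not reproduced here, so the following sketch only illustrates the standard graph Laplacian construction $L = D - S$ and the well-known identity that, for a symmetric $S$, the trace form equals half the similarity-weighted sum of squared distances between vectors. The function names and the toy data are hypothetical, and whether the patent uses exactly this form is an assumption.

```python
def graph_laplacian(s):
    """L = D - S, where D is diagonal with the row sums of S."""
    size = len(s)
    lap = [[-s[i][j] for j in range(size)] for i in range(size)]
    for i in range(size):
        lap[i][i] += sum(s[i])
    return lap


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def laplacian_penalty(vectors, lap):
    """Trace form: sum over p, q of L[p][q] * <x_p, x_q>."""
    size = len(vectors)
    return sum(lap[p][q] * dot(vectors[p], vectors[q])
               for p in range(size) for q in range(size))


def pairwise_penalty(vectors, s):
    """Half the similarity-weighted sum of squared distances."""
    size = len(vectors)
    total = 0.0
    for p in range(size):
        for q in range(size):
            diff = [x - y for x, y in zip(vectors[p], vectors[q])]
            total += s[p][q] * dot(diff, diff)
    return 0.5 * total


# Toy symmetric similarity matrix and latent vectors.
s = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.5], [0.0, 0.5, 0.0]]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
lap = graph_laplacian(s)
```

Both penalty functions return the same value on this data, which shows why minimizing the trace form pulls strongly related items toward each other in the latent space.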
To introduce the food group relation, the invention takes the geometric center of all foods of a group in the latent space as the group center, and every member of the group should lie close to it. In each iteration, the invention computes the center of each group from the $U$ and $V$ of the previous iteration, and these centers are treated as fixed in the current iteration. The group-center constraint can be expressed as:
[Equation image BDA0001822562430000072]
where [symbol image BDA0001822562430000073] is the $j$-th element of food group $G$ and [symbol image BDA0001822562430000074] is the geometric center of food group $G$, as defined in step 3. [Symbol image BDA0001822562430000075] is the Euclidean distance between member $j$ of group $G$ and the center of its group. Merging $R_0$, $R_1$, and $R_2$ into the basic matrix factorization objective [symbol image BDA0001822562430000076] gives the complete objective function:
[Equation image BDA0001822562430000077]
where $\|A\|^2$ is the $L_2$ regularization value of matrix $A$, i.e., the sum of the squares of all nonzero elements of $A$. $\lambda_0$ and $\lambda_1$ balance the influence of the group constraint and the weighting relation, and $\lambda_2$ controls model complexity to avoid overfitting. $\lambda_0$ and $\lambda_1$ are selected from {0, 0.001, 0.01, 0.1, 1, 10, 100, 1000} and $\lambda_2$ from {1, 10, 100, 1000} by grid search, i.e., by traversing all combinations of $\lambda_0$, $\lambda_1$, and $\lambda_2$ to find the best parameters of the model. Using gradient descent and the Lagrangian method, the following iterative update formulas for $U_{ik}$ and $V_{kj}$ are obtained:
[Equation image BDA0001822562430000078]
where $\Psi = (Y_{i\cdot} \odot (UV))(V^T)_{\cdot k}$,
[Equation image BDA0001822562430000081]
where
[Equation image BDA0001822562430000082]
and $A \odot B$ denotes the element-wise product of matrices $A$ and $B$.
Based on $O_1$, $U$ and $V$ are solved as follows:
(1) $O'_1 \leftarrow O_1$;
(2) compute the center of each group:
[Equation image BDA0001822562430000083]
(3) update $U$ and $V$ with the iterative update formulas for $U_{ik}$ and $V_{kj}$, respectively;
(4) compute the new objective value $O_1$ from the objective function;
(5) if $|O'_1 - O_1| < \epsilon$, stop the loop and output $U$ and $V$; otherwise set $ite \leftarrow ite + 1$, and if $ite \geq Max\_ites$, stop the loop and output $U$ and $V$; otherwise repeat (1)-(4).
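The control flow of steps (1)-(5) can be sketched as a generic driver. Because the update and objective formulas are equation images that are not reproduced in this text, they are passed in as callbacks; `compute_centers`, `update`, `objective`, and the default values of `eps` and `max_iters` are all hypothetical, and the toy callbacks at the end exist only to exercise the loop.

```python
def fit(u, v, compute_centers, update, objective, eps=1e-6, max_iters=100):
    """Iterate until the objective changes by less than eps or the
    iteration cap is reached; return U, V and the iteration count."""
    centers = compute_centers(u)
    current = objective(u, v, centers)
    iters = 0
    while True:
        previous = current                   # (1) remember the old objective
        centers = compute_centers(u)         # (2) centers from the last U
        u, v = update(u, v, centers)         # (3) centers stay fixed here
        current = objective(u, v, centers)   # (4) new objective value
        if abs(previous - current) < eps:    # (5) converged
            break
        iters += 1
        if iters >= max_iters:               # (5) iteration cap reached
            break
    return u, v, iters


# Toy callbacks: halve U each step and use the squared entry as objective.
def toy_centers(u):
    return None


def toy_update(u, v, centers):
    return [[x * 0.5 for x in row] for row in u], v


def toy_objective(u, v, centers):
    return u[0][0] ** 2


u_capped, _, iters_capped = fit([[1.0]], [[1.0]], toy_centers, toy_update,
                                toy_objective, eps=1e-12, max_iters=3)
u_conv, _, iters_conv = fit([[1.0]], [[1.0]], toy_centers, toy_update,
                            toy_objective, eps=0.1, max_iters=100)
```

Passing the centers into `update` as a plain argument mirrors the text's statement that the centers computed from the previous iteration are treated as fixed variables in the current one.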
Step 7: outputting the predicted food-disease associations by using the latent-space representations of foods and diseases.
The dot product of the $i$-th row of the food latent-space representation $U$ and the $j$-th column of the disease latent-space representation $V$ gives the predicted relation value between "food-amount-eating method" $i$ and disease $j$.
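The prediction step is a row-by-column dot product, sketched below with hypothetical function names and small example matrices.

```python
def predict_score(u, v, i, j):
    """Dot product of row i of U (n x K) and column j of V (K x m)."""
    return sum(u[i][k] * v[k][j] for k in range(len(v)))


def predict_all(u, v):
    """Score every food item against every disease."""
    return [[predict_score(u, v, i, j) for j in range(len(v[0]))]
            for i in range(len(u))]


# Example with 2 food items, 2 diseases, latent dimension 2.
u = [[1.0, 2.0], [0.0, 1.0]]
v = [[3.0, 0.0], [1.0, 4.0]]
scores = predict_all(u, v)
```

Ranking the scores in each row, after excluding the entries already known to be 1 in $R$, is one possible way to shortlist candidate associations for further study.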
It should be emphasized that the embodiments described herein are illustrative rather than restrictive, and thus the present invention is not limited to the embodiments described in the detailed description, but also includes other embodiments that can be derived from the technical solutions of the present invention by those skilled in the art.

Claims (7)

1. A food-disease association prediction method based on disease weighting and food category constraint is characterized by comprising the following steps:
step 1, constructing a disease weighting relation by using international disease classification data;
step 2, constructing a food similarity network by using the ingredient table;
step 3, constructing a food group relation by using a food classification system;
step 4, constructing a known binary food-disease association network;
step 5, randomly initializing the representation of the food and the diseases in the potential space;
step 6, introducing a disease weighting relation and a food group relation, and learning the representation of the potential space of the food and the disease;
step 7, outputting the correlation result of the predicted food and the disease by using the representation of the food and the disease potential space, thereby enhancing the accuracy of the food disease correlation prediction;
the specific implementation method of the step 6 comprises the following steps:
decomposing the food-disease association matrix R into the product of the food vector U and the disease vector V, the decomposition objective function is defined as:
Figure FDA0002298000470000011
defining a hierarchical relationship after disease weighting, limiting two diseases with adjacent parent-child relationships to a potential space to keep a relatively close distance:
Figure FDA0002298000470000012
wherein tr (-) represents trace, S ', of the matrix corresponding to the parenthesis'1Is a symmetry matrix of the disease; diagonal matrix (D'1)ii=∑j(S′1)ijGraph Laplacian L1=D'1-S′1,||A||2Is L of the A matrix2A regularization value; v.i、V.jColumn vectors of ith and jth columns in the V matrix; a. theTRefers to the transposition of the A matrix;
applying the common graph laplacian operator to the food similarity:
Figure FDA0002298000470000013
S2is a network of food similarity (D)2)ii=∑j(S2)ij,D2Is a diagonal matrix and the elements on the diagonal are S2A row of2=D2-S2
Introducing a food group relationship, taking the geometric center point of all foods in the potential space as a group center point, and all group members in the group should be close to the group center point; in each iteration, the center point of each group is calculated using the U and V used in the last iteration that has occurred, these points being used as fixed variables in the current iteration; the group-centered constraint is expressed as follows:
Figure FDA0002298000470000021
wherein
Figure FDA0002298000470000022
is the j-th element of food group G,
Figure FDA0002298000470000023
is the geometric center of food group G;
Figure FDA0002298000470000024
represents the Euclidean distance between a member j of group G and the center point of the group to which it belongs; R_0, R_1 and R_2 are merged into the basic matrix factorization objective
Figure FDA0002298000470000025
to obtain the following objective function:
Figure FDA0002298000470000026
s.t. U ≥ 0, V ≥ 0
wherein λ_0, λ_1 and λ_2 are manually specified parameters with the following value ranges: λ_0 and λ_1 are selected from the set {0, 0.001, 0.01, 0.1, 1, 10, 100, 1000}, and λ_2 is selected from the set {1, 10, 100, 1000}; the latent-space representations U and V of the foods and the diseases are obtained by solving with a gradient descent method.
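The optimization described in step 6 can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the formulas in the figures are not reproduced in this text, so the code assumes a squared-error fit term, Laplacian trace terms tr(V L_1 V^T) and tr(U^T L_2 U), and a squared distance to the group center. The nonnegativity constraint is handled by clipping after each gradient step, which is an assumption. The names `factorize`, `lam_group`, `lam_disease` and `lam_food` are hypothetical, and no mapping between them and λ_0, λ_1, λ_2 is asserted.

```python
import numpy as np

def laplacian(S):
    """Graph Laplacian L = D - S, where D holds the row sums of S."""
    return np.diag(S.sum(axis=1)) - S

def group_centers(U, groups):
    """Return a matrix whose row j is the center of the group containing food j.
    Foods outside every group keep their own row, so they incur no penalty."""
    C = U.copy()
    for members in groups:
        C[members] = U[members].mean(axis=0)
    return C

def objective(R, U, V, L1, L2, C, lam_group, lam_disease, lam_food):
    """Assumed objective: fit error plus the three weighted constraint terms."""
    return (np.sum((R - U @ V) ** 2)
            + lam_group * np.sum((U - C) ** 2)
            + lam_disease * np.trace(V @ L1 @ V.T)
            + lam_food * np.trace(U.T @ L2 @ U))

def factorize(R, S1w, S2, groups, K=4, lam_group=0.1, lam_disease=0.1,
              lam_food=1.0, lr=1e-3, iters=500, seed=0):
    """Projected gradient descent; S1w and S2 are assumed symmetric."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = rng.random((n, K))  # random values in [0, 1)
    V = rng.random((K, m))
    L1, L2 = laplacian(S1w), laplacian(S2)
    history = []
    for _ in range(iters):
        C = group_centers(U, groups)  # computed from the previous U, then held fixed
        E = U @ V - R
        grad_U = 2 * (E @ V.T + lam_group * (U - C) + lam_food * (L2 @ U))
        grad_V = 2 * (U.T @ E + lam_disease * (V @ L1))
        # Assumption: enforce U >= 0 and V >= 0 by clipping negative entries.
        U = np.maximum(U - lr * grad_U, 0.0)
        V = np.maximum(V - lr * grad_V, 0.0)
        history.append(objective(R, U, V, L1, L2, C,
                                 lam_group, lam_disease, lam_food))
    return U, V, history
```

For a symmetric similarity matrix S with Laplacian L, tr(V L V^T) equals one half of Σ_ij S_ij ||V_{.i} - V_{.j}||^2, which is why the trace terms pull related columns (or rows) toward each other.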
2. The method of claim 1, wherein step 1 is implemented as follows:
first, a disease relevance matrix S_1 is constructed from International Classification of Diseases data: if two diseases are in a parent-child relationship in the classification, the element (S_1)_ij of the disease relevance matrix is set to 1, otherwise (S_1)_ij = 0, wherein disease i is the parent of disease j, and disease j is a specific subclass of disease i;
then, the depth of parent node i is taken as the depth, depth_(i,j), of the edge formed by parent node i and child node j, and the weight C(depth_(i,j)) is defined as follows:
Figure FDA0002298000470000027
C(depth_(i,j)) = 1 + log(depth_(i,j))
finally, the hierarchy-weighted disease relevance matrix is expressed as follows:
(S′_1)_ij = (S_1)_ij * C(depth_(i,j)).
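The weighting in claim 2 can be illustrated with a short sketch. Several details are assumptions because the depth formula appears only in a figure: a root node is given depth 1, the logarithm is taken as natural, and the matrices are filled symmetrically (claim 1 describes S′_1 as symmetric). The function name `weighted_disease_matrix` and the `parent_of` dictionary are hypothetical.

```python
import math
import numpy as np

def weighted_disease_matrix(parent_of, diseases):
    """Build S1 (parent-child adjacency) and S1' (depth-weighted adjacency).

    parent_of: dict mapping each disease code to its parent code (None for roots).
    diseases:  ordered list of disease codes; fixes the row and column indices.
    """
    index = {d: k for k, d in enumerate(diseases)}
    n = len(diseases)

    def depth(node):
        # Assumption: a root has depth 1 and each level below adds 1.
        d = 1
        while parent_of.get(node) is not None:
            node = parent_of[node]
            d += 1
        return d

    S1 = np.zeros((n, n))
    S1w = np.zeros((n, n))
    for child, parent in parent_of.items():
        if parent is None:
            continue
        i, j = index[parent], index[child]
        weight = 1.0 + math.log(depth(parent))  # C(depth) = 1 + log(depth)
        # Assumption: symmetric fill, so the Laplacian built from S1' is symmetric.
        S1[i, j] = S1[j, i] = 1.0
        S1w[i, j] = S1w[j, i] = weight          # (S1')_ij = (S1)_ij * C(depth)
    return S1, S1w
```

Under these assumptions, edges deeper in the hierarchy receive larger weights, so more specific parent-child pairs are pulled together more strongly than broad top-level ones.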
3. The method of claim 1, wherein step 2 is implemented as follows: in the food similarity network, each node is a 'food-quantity-eating method' combination; when the 'quantity-eating method' of two nodes differs, the relationship between the two nodes is set to 0; when the 'quantity-eating method' is the same, the similarity between the two foods is calculated from the food ingredient table using the cosine formula and used as the node relationship value, yielding the food similarity network S_2.
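A minimal sketch of the similarity network in claim 3 is given below, assuming that the food ingredient table is available as a numeric matrix with one row per node. The function name `food_similarity` and its arguments are hypothetical. Zeroing the diagonal is an added assumption; it does not change the graph Laplacian D_2 - S_2.

```python
import numpy as np

def food_similarity(ingredients, quantity_method):
    """Cosine similarity between nodes that share the same quantity and eating method.

    ingredients:     (n, p) array-like, one row of ingredient amounts per node.
    quantity_method: length-n list of (quantity, eating_method) labels.
    """
    X = np.asarray(ingredients, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid division by zero for all-zero rows
    Xn = X / norms
    cos = Xn @ Xn.T                  # pairwise cosine similarity
    same = np.array([[a == b for b in quantity_method] for a in quantity_method])
    S2 = np.where(same, cos, 0.0)    # 0 when quantity or eating method differs
    np.fill_diagonal(S2, 0.0)        # assumption: no self-loops
    return S2
```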
4. The method of claim 1, wherein step 3 is implemented as follows: the food group relationship is constructed according to the nationally specified food classification system, using its 20 specific categories; foods with the same quantity, the same eating method and the same category are placed in one group, each element being a 'food name-quantity-eating method' triple; the foods are thus assigned to different food groups according to their properties and component ratios.
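The grouping rule in claim 4 can be sketched as a simple keyed partition. The tuple layout and the function name `build_food_groups` are hypothetical, and the category labels are assumed to come from the classification system mentioned in the claim.

```python
from collections import defaultdict

def build_food_groups(foods):
    """Group node indices that share category, quantity and eating method.

    foods: list of (name, quantity, eating_method, category) tuples,
           one per 'food name-quantity-eating method' node.
    Returns a dict {(category, quantity, eating_method): [node indices]}.
    """
    groups = defaultdict(list)
    for idx, (name, quantity, method, category) in enumerate(foods):
        groups[(category, quantity, method)].append(idx)
    return dict(groups)
```

The index lists returned here can serve as the group membership used when computing group center points.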
5. The method of claim 1, wherein step 4 is implemented as follows: the known food-disease associations are represented by a binary matrix R_(n×m); 'food name-quantity-eating method' is used as the refined food item, and modeling is performed on 'food name-quantity-eating method-disease' four-tuples; a four-tuple with a verified association is set to 1, and otherwise to 0, wherein the rows of the matrix are the n 'food name-quantity-eating method' items and the columns are the m diseases.
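The construction of R in claim 5 can be sketched as follows, assuming the verified associations are supplied as four-tuples. The function name `build_association_matrix` and the argument layout are hypothetical.

```python
import numpy as np

def build_association_matrix(food_nodes, diseases, verified):
    """Binary n x m matrix: rows are food nodes, columns are diseases.

    food_nodes: list of (name, quantity, eating_method) triples.
    diseases:   list of disease identifiers.
    verified:   iterable of (name, quantity, eating_method, disease) four-tuples.
    """
    row = {f: i for i, f in enumerate(food_nodes)}
    col = {d: j for j, d in enumerate(diseases)}
    R = np.zeros((len(food_nodes), len(diseases)), dtype=int)
    for name, quantity, method, disease in verified:
        R[row[(name, quantity, method)], col[disease]] = 1  # verified association
    return R
```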
6. The method of claim 1, wherein step 5 is implemented as follows: the latent-space representations of the foods and the diseases, U_(n×K) and V_(K×m), are randomly initialized by assigning an arbitrary number between 0 and 1 to each entry of the two matrices.
7. The method of claim 1, wherein step 7 is implemented as follows: the dot product of the i-th row of the food latent-space representation U and the j-th column of the disease latent-space representation V is computed to obtain the possible relationship value between 'food-quantity-eating method' i and disease j.
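The prediction in claim 7 amounts to a row-by-column dot product, sketched below. The ranking helper is an illustrative addition that the claim does not describe, and both function names are hypothetical.

```python
import numpy as np

def predict_score(U, V, i, j):
    """Dot product of row i of U and column j of V."""
    return float(U[i, :] @ V[:, j])

def rank_foods_for_disease(U, V, j):
    """Food node indices sorted by descending predicted score for disease j."""
    scores = U @ V[:, j]
    return np.argsort(-scores)
```

Computing `U @ V` gives all n × m predicted relationship values at once; `predict_score` returns a single entry of that product.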
CN201811180791.5A 2018-10-09 2018-10-09 Food-disease association prediction method based on disease weighting and food category constraint Active CN109300549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811180791.5A CN109300549B (en) 2018-10-09 2018-10-09 Food-disease association prediction method based on disease weighting and food category constraint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811180791.5A CN109300549B (en) 2018-10-09 2018-10-09 Food-disease association prediction method based on disease weighting and food category constraint

Publications (2)

Publication Number Publication Date
CN109300549A CN109300549A (en) 2019-02-01
CN109300549B true CN109300549B (en) 2020-03-17

Family

ID=65162268

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811180791.5A Active CN109300549B (en) 2018-10-09 2018-10-09 Food-disease association prediction method based on disease weighting and food category constraint

Country Status (1)

Country Link
CN (1) CN109300549B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110223786B (en) * 2019-06-13 2021-08-13 重庆亿创西北工业技术研究院有限公司 Method and system for predicting drug-drug interaction based on nonnegative tensor decomposition

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108346474A (en) * 2018-03-14 2018-07-31 湖南省蓝蜻蜓网络科技有限公司 The electronic health record feature selection approach of distribution within class and distribution between class based on word
CN108364677A (en) * 2018-03-13 2018-08-03 汤臣倍健股份有限公司 A kind of evaluating method and its device based on various dimensions health control model

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2564358T3 (en) * 2010-10-15 2016-03-22 The Trustees Of Columbia University In The City Of New York Obesity-related genes and their proteins and their uses
CN103605984B (en) * 2013-11-14 2016-08-24 厦门大学 Indoor scene sorting technique based on hypergraph study

Also Published As

Publication number Publication date
CN109300549A (en) 2019-02-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant