CN112650866A - Catering health analysis method based on image semantic deep learning - Google Patents
Catering health analysis method based on image semantic deep learning
- Publication number
- CN112650866A (application CN202010836022.7A)
- Authority
- CN
- China
- Prior art keywords
- dish
- image
- menu
- picture
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/53 — Information retrieval of still image data; querying
- G06F16/55 — Information retrieval of still image data; clustering, classification
- G06F16/583 — Retrieval using metadata automatically derived from the content
- G06F18/24 — Pattern recognition; classification techniques
- G06N3/045 — Neural network architectures; combinations of networks
- G06N3/08 — Neural networks; learning methods
- G16H20/60 — ICT for therapies or health-improving plans relating to nutrition control, e.g. diets
- G06V2201/07 — Image or video recognition; target detection
Abstract
The invention provides a catering health analysis method based on deep learning of image semantics, which takes a dish image as input and achieves high-precision dish image classification and dish nutrient calculation. In the classification part, the invention constructs a dish image classification network that learns the distance between recipes: the network takes the dish image and recipe information as input and, while learning the image information, also learns the raw-material portion of the recipe, further improving classification accuracy. In the nutrient calculation part, pixel-level semantic segmentation identifies the image information represented by each pixel, determines the proportion of each raw material in the picture, and thereby corrects the raw-material content of the recipe. Different pictures of the same dish can thus return different nutrient content information, making the nutrient calculation more accurate and scientific.
Description
Technical Field
The invention relates generally to computer vision, and in particular to deep-learning-based dish image recognition and dish image semantic segmentation.
Background
In today's society, dietary health is a topic of broad public concern. A reasonable, healthy diet can help people prevent diet-related diseases such as diabetes mellitus. At present, however, public education on dietary health remains insufficient, and most people still lack an accurate understanding of truly scientific dietary health. Dietary health therefore needs not only greater attention but also a way to help the public understand diet scientifically and to provide guidance of medical value: people need not just a general impression of their diet, but concrete numbers and data.
In terms of dish analysis systems, most existing catering analysis systems have two shortcomings: they require the user to have dish-related knowledge such as raw materials and recipes; and the nutrient information they provide is not comprehensive enough and lacks scientific, medically grounded guidance.
In terms of dish analysis algorithms, there are currently two mainstream dish image analysis approaches:
(1) Training a single-label classifier with a convolutional neural network, where each dish corresponds to one class and each picture receives one class. Because pictures in dish recognition often exhibit high similarity and complexity, this approach does not achieve good results.
(2) Training a multi-label classifier on a convolutional neural network backbone, where each raw material corresponds to one class and each picture receives several classes. This approach requires a large amount of additional manual information, such as prior relations among dish raw materials; it also does not learn deeply from the recipes, so there is still room to improve its accuracy.
Meanwhile, pixel-level semantic segmentation of dish raw materials has not yet been attempted, and dish nutrient calculation remains a surface-level data lookup per dish: different pictures of the same dish yield the same per-unit-mass nutrients, with no fine-grained analysis of the user's own picture.
Disclosure of Invention
The invention provides a catering health analysis system that can meet most daily needs regarding dietary health. It identifies the dish name and its recipe from the dish picture submitted by the user, then performs pixel-level semantic segmentation on the picture to determine the dish raw-material information carried by each pixel, and thereby calculates the dish's nutrient content accurately. The user only needs to input a dish picture and its mass, and a nutrient reference table is output.
The technical scheme of the invention is as follows:
(1) Target detection: after the user inputs a picture, a target detection method locates containers such as bowls, refines the dish position to obtain the dish bounding box, and removes irrelevant factors such as the background.
(2) Dish recognition that learns the distance between recipes: after the dish bounding box is obtained in step (1), a classification model that can learn inter-class distances learns picture and recipe information jointly and matches them, finally returning the five dishes (with their recipes) that best match the picture, for the user to choose from.
(3) Dish nutrient calculation: after the dish name and recipe are obtained, dishes with several main food materials undergo pixel-level semantic segmentation; the raw materials are segmented by colour, and the proportion of each raw material is refined, so that nutrient content is calculated more accurately.
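The three steps above can be sketched in Python as follows. The function names and the toy cropping, ranking, and look-up logic are illustrative assumptions, not the patent's actual detector, classifier, or nutrient database:

```python
def detect_dish_region(image):
    """Step (1), toy version: crop a centre region standing in for the
    detector's dish bounding box (a real system would run an object
    detector for bowls/plates and remove the background)."""
    h, w = len(image), len(image[0])
    return [row[w // 4: 3 * w // 4] for row in image[h // 4: 3 * h // 4]]

def match_recipes(dish_crop, recipe_scores):
    """Step (2), toy version: given precomputed image-recipe matching
    scores, return the five best-matching recipes."""
    ranked = sorted(recipe_scores, key=recipe_scores.get, reverse=True)
    return ranked[:5]

def estimate_nutrients(recipe, mass_g, nutrients_per_100g):
    """Step (3), toy version: scale per-100 g nutrient values of the
    chosen recipe to the user-supplied dish mass."""
    return {k: v * mass_g / 100.0 for k, v in nutrients_per_100g[recipe].items()}
```

A usage pass would chain the three: crop the image, pick a recipe from the top five, then scale its nutrients by the entered mass.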
The advantages of the method are that the user needs no dish-related knowledge: the effective region of the dish image is extracted automatically and the dish type is analysed by the recognition model; and beyond obtaining nutrient information from the picture and the mass, the nutrient content is further refined by means of the semantic segmentation model.
Drawings
FIG. 1 is a method framework and flow chart
FIG. 2 is a diagram showing the effect of the target detection model
FIG. 3 is a frame diagram of a classification model for learnable inter-class distance
FIG. 4 is a graph showing the relationship between recipes obtained by the model shown in FIG. 3
FIG. 5 is a diagram showing the pixel-level semantic segmentation effect on dish raw materials
Detailed Description
As shown in fig. 1, the specific flow of the catering health analysis system based on deep learning of image semantics is as follows:
Step 2: the image enters the dish recognition module. First, as shown in fig. 2, the target detection model removes redundant information such as the background and yields the dish bounding box. A binary classifier then judges whether the picture contains a dish. Finally, if the judgment is positive, the image enters the model shown in fig. 3, which returns the 5 dish names (and their recipes) that best match the picture.
For the model shown in fig. 3, the model consists of an image encoder and a text encoder. During training, each model input is a set of image-recipe pairs (image_k, ingredient_k, y_k), k ∈ [0, K], where image_k is the k-th picture, ingredient_k is the corresponding raw-material list, i.e. the recipe, y_k indicates whether the two match, and K is the total number of pictures. In training, y_k is assigned at random: with probability 80% it is set to 1, and with probability 20% it is set to 0, i.e. not matched, in which case a non-matching recipe is chosen at random as ingredient_k. Denoting the image encoder as Encoder_image and the text encoder as Encoder_ingre, then:

v_k = Encoder_image(image_k),  u_k = Encoder_ingre(ingredient_k).
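The construction of training pairs described above (80% matched, 20% randomly mismatched) can be sketched as follows; `make_training_pairs` is a hypothetical helper, not part of the patent:

```python
import random

def make_training_pairs(images, recipes, match_prob=0.8, seed=0):
    """Build (image_k, ingredient_k, y_k) triples: with probability
    match_prob keep the true recipe (y_k = 1); otherwise substitute a
    randomly chosen non-matching recipe (y_k = 0)."""
    rng = random.Random(seed)
    pairs = []
    for img, true_recipe in zip(images, recipes):
        if rng.random() < match_prob:
            pairs.append((img, true_recipe, 1))
        else:
            # pick any recipe other than the true one as the mismatch
            wrong = rng.choice([r for r in recipes if r != true_recipe])
            pairs.append((img, wrong, 0))
    return pairs
```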
The goal of training is that, when the image and the text match, the distance between Encoder_image(image_k) and Encoder_ingre(ingredient_k) is as small as possible, and vice versa. In the ideal case, the vector computed from a dish picture coincides exactly with that of its corresponding recipe and is orthogonal to every non-corresponding recipe.
The loss function is therefore set to the cosine loss:

L = Σ_k [ y_k · (1 − cos(Encoder_image(image_k), Encoder_ingre(ingredient_k))) + (1 − y_k) · max(0, cos(Encoder_image(image_k), Encoder_ingre(ingredient_k))) ]
At test time, Encoder_ingre(ingredient_i) is first computed and stored for the recipe ingredient_i of each dish i. When a user submits a query picture image_query, the score of the i-th recipe is:

score_i = Encoder_image(image_query) · Encoder_ingre(ingredient_i)
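Under the assumption that both encoders output fixed-length vectors, the matching score (cosine similarity, equivalently the dot product of unit vectors, per claim 2) and the top-5 retrieval over precomputed recipe embeddings can be sketched as:

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors pass through)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def match_score(image_vec, recipe_vec):
    """Cosine similarity between the two encoder outputs, used as the
    image-recipe matching score."""
    a, b = normalize(image_vec), normalize(recipe_vec)
    return sum(x * y for x, y in zip(a, b))

def top_k(image_vec, recipe_vecs, k=5):
    """Rank stored recipe embeddings against a query image embedding."""
    scores = {name: match_score(image_vec, v) for name, v in recipe_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Precomputing and storing the recipe embeddings, as the text describes, makes each query a batch of dot products rather than a fresh text-encoder pass.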
Fig. 4 verifies that the model learns the relational information between recipes: recipes with similar raw materials tend to receive higher matching scores.
Step 3: nutrient query
For dishes with a single main food material, the ratio of the user-input mass to the dish mass in the recipe is computed and multiplied by the nutrient values of the food material. For dishes with two or more main food materials, the proportions among the food materials are further refined by the image semantic segmentation model (effect shown in fig. 5), making the nutrient result more accurate.
The image semantic segmentation model adopts a MobileNetV2 + PPM structure. MobileNetV2 is a lightweight convolutional network: it factorises a standard convolutional layer into a depthwise convolution and a pointwise convolution, greatly reducing the number of multiplications the convolution requires. It also adjusts where ReLU is applied and adds residual connections, further improving accuracy. The PPM (pyramid pooling module) captures context, i.e. the relationships between raw materials, and performs well on small-object detection. The model classifies raw materials by colour, which yields better robustness and generality.
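The multiplication savings of the depthwise + pointwise factorisation can be checked with simple arithmetic; the formulas below are the standard operation counts for the two convolution forms, not figures taken from the patent text:

```python
def conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a standard k x k convolution producing an
    h x w output map (one multiply per kernel tap, input channel,
    output channel, and output position)."""
    return h * w * c_in * c_out * k * k

def separable_mults(h, w, c_in, c_out, k):
    """Depthwise (k x k per input channel) plus pointwise (1 x 1)
    convolution -- the factorisation MobileNetV2 uses."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise
```

For a 56×56 map with 64 input channels, 128 output channels and a 3×3 kernel, the separable form needs about 1/128 + 1/9 ≈ 12% of the multiplications of the standard convolution.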
Claims (4)
1. A catering health analysis method based on deep learning of image semantics is characterized by comprising the following steps:
(1) Inputting a picture; detecting the positions of containers such as bowls by a target detection method, refining the dish position to obtain the dish bounding box, and removing irrelevant factors such as the background.
(2) Dish image recognition: taking the background-free dish image as input, mapping the dish image and the recipes into the same domain through a classification network that can learn the distance between recipes, obtaining the distance between the dish image and each recipe, and returning the five recipes closest to the image.
(3) Dish nutrient calculation and correction: performing pixel-level image semantic segmentation on the dish image and determining whether each pixel belongs to the background or to a particular dish raw material, thereby correcting the recipe and calculating the nutrient content accurately.
2. The recipe-distance-learning classification method of claim 1, further comprising:
(1) a dish classification model composed of an image encoder and a text encoder, which encode pictures and recipes into 512-dimensional vectors respectively, the distance relations among recipes being learned by means of the text encoder;
(2) adopting cosine loss as the loss function, so that the image encoder's output approaches the text encoder's output for matching pairs;
(3) using the dot product of the two encoders' vectors as the matching score between a picture and a recipe, so that a matching score can be produced for any recipe and any dish picture.
3. The dish nutrient calculation method of claim 1, further comprising:
(1) first dividing the dish food materials into five colour classes: red, yellow, green, black and white;
(2) then classifying each pixel of the dish image with a trained semantic segmentation network into six classes (red, yellow, green, black, white and non-dish) and returning the pixel proportions among the food materials;
(3) further correcting the raw-material content of the recipe according to the returned pixel proportions.
4. The recipe raw-material correction method of claim 3, further comprising:
defining a standard recipe; for the dish named R in the standard recipe, its raw materials are expressed as ingre = (ingre_red, ingre_yellow, ingre_green, ingre_black, ingre_white, ingre_other), representing the six food-material classes, and their corresponding masses in the standard recipe are m = (m_red, m_yellow, m_green, m_black, m_white, m_other); the proportion p_c^i of the food material with colour c in the i-th picture of R in the DIMAX data set is defined as:

p_c^i = pixel_c^i / Σ_{j ∈ {red, yellow, green, black, white}} pixel_j^i,

where pixel_c^i denotes the number of pixels occupied by the corresponding food material in the i-th picture of R;
the volume ratio V_c of each food material in R is further calculated as:

V_c = (1/n) Σ_{i=1}^{n} p_c^i,

where n is the total number of pictures of R in the DIMAX data set;
for the picture of R input by the user, the actual food-material proportions V' are obtained through the image semantic segmentation model, and the actual total mass is m'; taking the food material with colour c as an example:

m'_c = m' · V'_c / Σ_{i ∈ {red, yellow, green, black, white}} V'_i,

where i, c ∈ {red, yellow, green, black, white}, and m'_c is the corrected mass of the food material with colour c.
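A minimal numeric sketch of the mass correction described in claim 4: the user-supplied total mass is distributed over the colour classes in proportion to the segmented pixel ratios. The exact formula is partly elided in the text, so the normalisation over the ratio sum is an assumption:

```python
def corrected_mass(total_mass, pixel_ratio):
    """Distribute the dish's total mass m' over the colour classes in
    proportion to the segmented ratios V' (an assumed reconstruction of
    the claim-4 correction: m'_c = m' * V'_c / sum_i V'_i)."""
    total_ratio = sum(pixel_ratio.values())
    return {c: total_mass * r / total_ratio for c, r in pixel_ratio.items()}
```

For a 200 g dish segmented as half red, 30% green, and 20% white, this assigns 100 g, 60 g and 40 g respectively; the per-class masses always sum back to the input mass.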
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010836022.7A CN112650866A (en) | 2020-08-19 | 2020-08-19 | Catering health analysis method based on image semantic deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112650866A true CN112650866A (en) | 2021-04-13 |
Family
ID=75346136
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010836022.7A Pending CN112650866A (en) | 2020-08-19 | 2020-08-19 | Catering health analysis method based on image semantic deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112650866A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114360690A (en) * | 2022-03-18 | 2022-04-15 | 天津九安医疗电子股份有限公司 | Method and system for managing diet nutrition of chronic disease patient |
WO2023159909A1 (en) * | 2022-02-25 | 2023-08-31 | 重庆邮电大学 | Nutritional management method and system using deep learning-based food image recognition model |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100173269A1 (en) * | 2009-01-07 | 2010-07-08 | Manika Puri | Food recognition using visual analysis and speech recognition |
CN104730073A (en) * | 2015-02-16 | 2015-06-24 | 中国土产畜产进出口总公司 | Quantitative analysis method and system for dishes contained in plates |
CN108198188A (en) * | 2017-12-28 | 2018-06-22 | 北京奇虎科技有限公司 | Food nutrition analysis method, device and computing device based on picture |
KR20180093141A (en) * | 2017-02-09 | 2018-08-21 | 주식회사 롭썬컴퍼니 | A meal calendar system using the image processing method based on colors |
CN110852733A (en) * | 2019-10-22 | 2020-02-28 | 杭州效准智能科技有限公司 | Intelligent catering settlement system based on RFID fusion dish image matching identification |
CN111128341A (en) * | 2019-11-07 | 2020-05-08 | 北京航空航天大学 | Dish identification APP based on deep learning |
- 2020-08-19: CN202010836022.7A filed (publication CN112650866A); status: pending
Non-Patent Citations (1)
Title |
---|
飞桨PaddlePaddle: "I built a dish image recognition *** with PaddlePaddle", https://blog.csdn.net/paddlepaddle/article/details/104666572/, 4 March 2020 (2020-03-04), pages 1 - 9 * |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||