AU2019100969A4 - Chinese Food Recognition and Search System - Google Patents
- Publication number
- AU2019100969A4
- Authority
- AU
- Australia
- Prior art keywords
- image
- images
- chinese
- layers
- search
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/68—Food, e.g. fruit or vegetables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
An image recognition and search system for Chinese dishes based on a deep learning algorithm. It is able to recognize four popular Chinese dishes (Potato Silk, Baby Cabbage, Mapo Tofu and Fried Beans) and then search local files to display a similar image of the same type. Our image recognition system mainly includes three parts: data preprocessing, model training and a graphical user interface. The input images are collected from the Internet by keyword search. During data preprocessing, in which we remove the irrelevant images, we transform each input image to a resolution of 32X32. The collected dataset is then separated into two parts: a training set and a test set. We build a convolutional neural network (CNN) which consists of 4 convolutional layers and 2 fully connected layers. The system extracts features of the input data by passing it through the convolutional layers and classifies the image through the fully connected layers. Our Chinese food recognition system achieves an average accuracy of 83.25% on the test set. Using the graphical user interface, users can upload a target picture, and the operations and results are displayed in the text field.
Description
Chinese Food Recognition and Search System
FIELD OF INVENTION
The present invention belongs to the field of image identification within image processing and machine learning. More particularly, the invention, a Chinese dish image recognition system, relates to deep learning and neural networks.
BACKGROUND OF THE INVENTION
Food plays quite an important role in everyone’s daily life. In recent years, with the development of technology, mobile phones and computers have become essential tools in our lives. An increasing number of people like to take photos of the food they cook or eat and then share the pictures on social networks such as WeChat or Twitter. In real life, people are curious about dishes that they have never encountered before. However, Chinese cuisine includes a great many types of dishes with different names, and most people can identify only a few of them without any tools. Consequently, helping ordinary people recognize different dishes has become a real need. In addition, dish recognition and search can be applied in other areas, such as micro POS systems in restaurants and intelligent plates which can give an introduction to the dish.
2019100969 29 Aug 2019
Dish recognition is a classification problem: each type of dish is given a different label. This process requires an automatic method for image recognition. In traditional image recognition, we usually have to extract features from different objects and treat them as definitions of these objects. When performing recognition, the computer compares the image matrix with the different definitions and finds the closest one. However, dish recognition is much more difficult than recognition of other ordinary objects, because some dishes, especially Chinese dishes, look quite similar. Consequently, traditional image recognition struggles to distinguish similar dishes accurately.
In recent years, with the development of high-performance computing platforms and big data processing techniques, deep learning has become a powerful method for image recognition. Deep learning builds a neural network model with several hidden layers and trains it on a large amount of data to learn more useful features, and thus improves the accuracy of the prediction.
In this invention, we use TensorFlow, a framework that is widely used for deep learning applications. Throughout the process we collect dish images ourselves using a web crawler which automatically downloads pictures from BaiduPicture. We also label the pictures ourselves. What distinguishes our approach from other dish recognition processes is that the pictures we use for training are all of size 32X32, while others are much larger than this, such as ResNet or VGG, which use 224X224 or bigger [1,2]. This means that this invention does not place high requirements on images, which makes it much more accessible and faster than other methods. After data processing, we feed the training dataset into the convolutional neural network in batches. Throughout the training process, the program optimizes the weights and biases of all layers of the neural network to minimize the loss function. By adjusting the parameters of the network, the model can achieve an optimal performance and a high accuracy.
SUMMARY OF THE INVENTION
This invention is a Chinese food recognition system which includes three parts: data preprocessing, model training and a graphical user interface. It is able to recognize 4 popular Chinese dishes (Potato Silk, Baby Cabbage, Mapo Tofu and Fried Beans), can be retrained for any other type of food, and then searches the local files to display one similar image of the same type.
1. Data Description
There are many food styles in Chinese history and food culture, such as Sichuan cuisine, Northeastern cuisine, etc. Our Chinese food dataset contains 4 of the most popular Chinese dishes from different cuisines: Potato Silk, Baby Cabbage, Mapo Tofu and Fried Beans. Images of these 4 types of food are gathered from the internet. However, some of them are mislabeled, since they were uploaded by individuals without rechecking. We need to remove these irrelevant images to ensure consistency. Some food samples are shown in Figure 1(a).
2. Data Preprocessing
Data Clean and Label
After collecting the food images, we first cleaned them and generated the corresponding label for each image. We then removed irrelevant images and images with irregular height or width (too large or too small), which may be distorted after being resized. Finally, we resized all images to the same size of 32X32 (shown in Figure 1(b)).
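The size-based part of this cleaning step can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function names and every threshold (`min_side`, `max_side`, `max_aspect`) are assumptions, since the source only says "too large or too small".

```python
def keep_image(width, height, min_side=64, max_side=2000, max_aspect=2.0):
    """Return True if an image's dimensions look regular enough to resize.

    All thresholds are illustrative assumptions, not values from the source.
    """
    if min(width, height) <= 0:
        return False  # invalid dimensions
    if min(width, height) < min_side or max(width, height) > max_side:
        return False  # too small or too large
    # A very elongated image would be badly distorted when squashed to 32x32.
    aspect = max(width, height) / min(width, height)
    return aspect <= max_aspect


def filter_by_size(records):
    """Keep only (name, width, height) records that pass keep_image."""
    return [rec for rec in records if keep_image(rec[1], rec[2])]


records = [
    ("a.jpg", 640, 480),   # regular
    ("b.jpg", 30, 30),     # too small
    ("c.jpg", 3000, 400),  # too large and too elongated
    ("d.jpg", 500, 500),   # regular
]
kept = filter_by_size(records)
```

Removing irrelevant (mislabeled) images still needs a manual or model-based check; a dimension filter of this kind only covers the irregular-size case.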
Data Augmentation
We performed data augmentation to increase the number of samples from about 1100 images per label to 5500 images per label. We achieved this by randomly rotating, shifting and zooming images within a small range using Keras (shown in Figure 1(c)).
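The fivefold increase can be illustrated with a small sketch that only samples the transform parameters. It assumes each original is kept and four randomly transformed variants are added, which is one way to get from about 1100 to about 5500 images per label; the source does not state how the factor was reached, and the ranges below are hypothetical. In Keras, such ranges would typically map to the `rotation_range`, `width_shift_range`, `height_shift_range` and `zoom_range` arguments of `ImageDataGenerator`, although the source does not say which Keras API was used.

```python
import random


def sample_augmentation(rng, max_rotation_deg=15.0, max_shift_frac=0.1, max_zoom=0.1):
    """Draw one set of small random transform parameters.

    The ranges are hypothetical; the source only says "a small range".
    """
    return {
        "rotation_deg": rng.uniform(-max_rotation_deg, max_rotation_deg),
        "shift_x_frac": rng.uniform(-max_shift_frac, max_shift_frac),
        "shift_y_frac": rng.uniform(-max_shift_frac, max_shift_frac),
        "zoom": rng.uniform(1.0 - max_zoom, 1.0 + max_zoom),
    }


def augmentation_plan(n_originals, variants_per_image, seed=0):
    """Pair every original image index with its list of sampled transforms."""
    rng = random.Random(seed)
    return [
        (i, [sample_augmentation(rng) for _ in range(variants_per_image)])
        for i in range(n_originals)
    ]


# About 1100 originals per label plus 4 variants each gives about 5500 per label.
plan = augmentation_plan(n_originals=1100, variants_per_image=4)
total_per_label = len(plan) + sum(len(variants) for _, variants in plan)
```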
3. Model
The schematic diagram of the CNN model in our invention is shown in Figure 2. Randomly cropped patches of size 32X32 from the original images are used as input. The CNN consists of 4 convolutional layers, all with an output depth of 32 and ReLU activation functions; only the second and fourth layers are followed by max pooling. After passing through the convolutional layers, the input is decoded and flattened into a feature vector, which is classified by the following two fully connected layers. The last layer has 4 nodes with sigmoid activation functions, whose outputs are used to calculate the loss and the probabilities for each label. The model is trained with the back-propagation algorithm on a single CPU, using batch gradient descent and the Adam optimizer.
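To make the layer sequence concrete, the sketch below traces tensor shapes through the described network. Several details are assumptions that the source does not give: "same" padding with stride 1 in the convolutions, 2x2 non-overlapping max pooling, and a hidden fully connected layer of 128 units. Only the input size, the depth of 32, the pooling positions and the 4 output nodes come from the text.

```python
def conv_same(shape, filters):
    """Convolution with 'same' padding and stride 1: only the depth changes."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, size=2):
    """Non-overlapping max pooling: height and width shrink by `size`."""
    height, width, depth = shape
    return (height // size, width // size, depth)


def trace_shapes(input_shape=(32, 32, 3), filters=32, hidden_units=128, n_classes=4):
    """List (layer name, output shape) for the described architecture.

    'same' padding, 2x2 pooling and hidden_units=128 are assumptions.
    """
    trace = [("input", input_shape)]
    shape = input_shape
    for layer in range(1, 5):
        shape = conv_same(shape, filters)
        trace.append((f"conv{layer}+relu", shape))
        if layer in (2, 4):  # pooling only after the second and fourth conv layers
            shape = max_pool(shape)
            trace.append((f"pool{layer}", shape))
    flat = shape[0] * shape[1] * shape[2]
    trace.append(("flatten", (flat,)))
    trace.append(("fc1", (hidden_units,)))
    trace.append(("fc2+sigmoid", (n_classes,)))
    return trace


for name, shape in trace_shapes():
    print(f"{name:12s} {shape}")
```

Under these assumptions the two pooling steps reduce 32X32 to 8X8, so the flattened feature vector has 8 * 8 * 32 = 2048 entries before the fully connected layers.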
4. Graphic User Interface
The GUI for this invention is shown in Figure 3. The “Open a Target Image” button loads an image to recognize. Clicking the “Show Recognition Result” button makes the recognition system read the image, output the classification result, and search for and display an image with the same name. The text field on the right displays the operations and results, and the image on the left is a similar image found by the system.
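The search step can be sketched as a name-based lookup over local files. This is only an illustration: the function name, the directory layout and the `<label>_<number>.jpg` naming convention are hypothetical, because the source says only that an image "having the same name" is searched for in local files.

```python
from pathlib import PurePath


def find_same_type(label, filenames):
    """Return the local files whose name starts with the predicted label.

    The "<label>_<number>.jpg" naming convention is a hypothetical choice.
    """
    prefix = label.lower().replace(" ", "_") + "_"
    return sorted(f for f in filenames if PurePath(f).name.lower().startswith(prefix))


local_files = [
    "images/mapo_tofu_0001.jpg",
    "images/potato_silk_0007.jpg",
    "images/mapo_tofu_0002.jpg",
    "images/baby_cabbage_0003.jpg",
]
matches = find_same_type("Mapo Tofu", local_files)
to_display = matches[0] if matches else None  # image the GUI would show
```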
DESCRIPTION OF THE DRAWINGS
The appended drawings are only for the purpose of description and explanation but not for limitation, wherein:
1. Fig. 1: (a) Samples of original images. (b) Images reshaped into size of 32X32. (c) Data augmentation by rotating, shifting and zooming images.
2. Fig. 2 is the schematic diagram of the CNN model.
3. Fig. 3 shows how the loss changes with training steps.
DESCRIPTION OF PREFERRED EMBODIMENT
Food image recognition is one of the most promising applications of visual object recognition, since it not only helps people find out what they are eating but also helps estimate food calories and analyze people’s eating habits for the sake of health. Meanwhile, the CNN (Convolutional Neural Network) is currently one of the most widely used deep learning methods in machine learning due to its powerful modelling capability on complex and large-scale datasets. We therefore apply CNNs to food image recognition.
Because many food items are indistinguishable in terms of shape or color, some food characteristics are hard to recognize by simple examination, and it is extremely challenging to identify every food item. Therefore, we consider it a better option to classify and identify food items at a general level and use the result to automatically approximate their dietary information.
1. Data collection:
In implementing the project, we paid close attention to the quality of the images used to train our model in order to guarantee precision. We used a crawler tool to download images of acceptable quality from the web, including Baidu image search and Google image search. We prepared a total of four categories of dishes (dry-cooked string beans, stir-fried tofu in hot sauce, shredded potatoes and baby cabbage in chicken soup), and each dish has about 5,000 images.
Dish | One-hot label |
---|---|
Potato Silk | [1,0,0,0] |
Baby Cabbage | [0,1,0,0] |
Mapo Tofu | [0,0,1,0] |
Fried Beans | [0,0,0,1] |
Table 1: The one-hot encoding for labels
2. Data pre-processing
Furthermore, we pre-processed the collected images and rotated them at different angles to simulate different camera angles in real life and so improve the recognition rate of the model. To explore a less-studied setting, we decided to perform image recognition on low-resolution images. To achieve this, we resized all the original images to 32X32 pixels, so that we can reach a high accuracy even with a low-quality picture.
After collecting images from the internet, the categorical labels cannot be used directly in this project: the number of possible values is limited to a fixed set, and many machine learning algorithms only operate on numeric label data. We therefore first map each dish to an integer: “0” represents Potato Silk, “1” represents Baby Cabbage, “2” represents Mapo Tofu and “3” represents Fried Beans. We then convert the integer labels into a one-hot encoding; the final results are shown in Table 1.
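The two-step label mapping (dish name to integer, integer to one-hot vector) can be written in a few lines of plain Python. The names `DISHES`, `LABEL_TO_INDEX` and `one_hot` are illustrative; the integer order follows the text.

```python
DISHES = ["Potato Silk", "Baby Cabbage", "Mapo Tofu", "Fried Beans"]

# Integer representation: 0, 1, 2, 3 in the order given in the text.
LABEL_TO_INDEX = {name: index for index, name in enumerate(DISHES)}


def one_hot(index, n_classes=len(DISHES)):
    """Return a list of zeros with a single 1 at position `index`."""
    if not 0 <= index < n_classes:
        raise ValueError(f"index {index} is outside 0..{n_classes - 1}")
    return [1 if i == index else 0 for i in range(n_classes)]


encoded = {name: one_hot(LABEL_TO_INDEX[name]) for name in DISHES}
```

Here `encoded["Mapo Tofu"]` is `[0, 0, 1, 0]`, matching the one-hot table for the four dishes.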
3. Architecture and training process
Our architecture consists of an input layer, 4 convolutional layers and 2 fully connected layers. We input images of size 32X32 as a tensor of shape (256,32,32,3). After passing through the four convolutional layers, which include max-pooling layers that downsample and reduce the size of the data space, the features of the images are extracted. The images are then classified by passing through the fully connected layers.
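Reading the first dimension of (256,32,32,3) as the batch size, which is an interpretation rather than something the source states outright, the dataset would be fed to the network in slices like these. The helper names and the example dataset size of 1000 are hypothetical.

```python
def batch_ranges(n_samples, batch_size=256):
    """Yield (start, stop) index pairs covering the dataset in order."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)


def batch_shape(start, stop, height=32, width=32, channels=3):
    """Shape of the image tensor for one batch."""
    return (stop - start, height, width, channels)


ranges = list(batch_ranges(n_samples=1000))  # 1000 is an arbitrary example size
shapes = [batch_shape(start, stop) for start, stop in ranges]
```

With 1000 samples this gives three full batches of shape (256, 32, 32, 3) and a final partial batch of 232 images; whether the original code dropped or kept such a remainder is not stated.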
We employed dropout and L2 regularization to avoid overfitting in the training phase and used our own framework built with TensorFlow to run the experiments. The dropout rate, base learning rate, decay rate and number of iteration steps are the 4 main training parameters.
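One common way a base learning rate and a decay rate interact is exponential decay, sketched below. The source does not say which schedule was used, so the formula is an assumption, and every numeric value in `config` (including the extra `decay_steps` parameter) is a hypothetical placeholder rather than a reported setting.

```python
def decayed_learning_rate(base_lr, decay_rate, step, decay_steps):
    """Exponential decay: base_lr * decay_rate ** (step / decay_steps)."""
    return base_lr * decay_rate ** (step / decay_steps)


# Hypothetical hyperparameter values; the source does not report them.
config = {
    "dropout_rate": 0.5,
    "base_learning_rate": 1e-3,
    "decay_rate": 0.96,
    "decay_steps": 1000,
    "iteration_steps": 10000,
}

# Learning rate sampled every `decay_steps` iterations over the whole run.
schedule = [
    decayed_learning_rate(
        config["base_learning_rate"], config["decay_rate"], step, config["decay_steps"]
    )
    for step in range(0, config["iteration_steps"] + 1, config["decay_steps"])
]
```

With these placeholder values the rate starts at 0.001 and shrinks by 4% every 1000 steps.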
In machine learning, accuracy is one of the most important metrics for evaluating models. Other crucial metrics, such as the confusion matrix and the standard deviation, are also measured in our invention. For our model, the average accuracy reaches 83.25% with a standard deviation of 1.793, and the confusion matrix is shown in Table 2.
Table 2: Confusion matrix for the model
 | Potato Silk | Baby Cabbage | Mapo Tofu | Fried Beans |
---|---|---|---|---|
Potato Silk | 808 | 171 | 20 | 22 |
Baby Cabbage | 180 | 780 | 51 | 44 |
Mapo Tofu | 17 | 61 | 880 | 17 |
Fried Beans | 12 | 16 | 52 | 869 |
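As a worked check, the accuracy implied by the confusion matrix can be computed directly from its counts. The sketch below copies the numbers from Table 2; the function names are illustrative. The source does not say whether rows are true labels or predictions, so the per-row rate is recall under the first reading and precision under the second.

```python
LABELS = ["Potato Silk", "Baby Cabbage", "Mapo Tofu", "Fried Beans"]

# Counts copied from Table 2, one inner list per table row.
CONFUSION = [
    [808, 171, 20, 22],
    [180, 780, 51, 44],
    [17, 61, 880, 17],
    [12, 16, 52, 869],
]


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_row_rate(matrix):
    """Diagonal count divided by the row total, for each row."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


accuracy = overall_accuracy(CONFUSION)
row_rates = dict(zip(LABELS, per_row_rate(CONFUSION)))
```

The matrix covers 4000 images with 3337 on the diagonal, an accuracy of about 83.4%, close to the reported average of 83.25%; the small gap would be consistent with the reported figure being an average over several runs, though the source does not say so. Most errors fall between Potato Silk and Baby Cabbage.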
Claims (2)
What is claimed is:
- 1. A convolutional neural network (CNN) framework written in Python and TensorFlow which can be used to construct a neural network.
- 2. The whole existing recognition system comprising: a) a trained CNN model able to classify images; b) a graphic user interface (GUI); c) a Chinese food dataset collected for training.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2019100969A AU2019100969A4 (en) | 2019-08-29 | 2019-08-29 | Chinese Food Recognition and Search System |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2019100969A AU2019100969A4 (en) | 2019-08-29 | 2019-08-29 | Chinese Food Recognition and Search System |
Publications (1)
Publication Number | Publication Date |
---|---|
AU2019100969A4 true AU2019100969A4 (en) | 2019-10-03 |
Family
ID=68063082
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
AU2019100969A Ceased AU2019100969A4 (en) | 2019-08-29 | 2019-08-29 | Chinese Food Recognition and Search System |
Country Status (1)
Country | Link |
---|---|
AU (1) | AU2019100969A4 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110780978A (en) * | 2019-10-25 | 2020-02-11 | 下一代互联网重大应用技术(北京)工程研究中心有限公司 | Data processing method, system, device and medium |
CN110780978B (en) * | 2019-10-25 | 2022-06-24 | 赛尔网络有限公司 | Data processing method, system, device and medium |
CN112508072A (en) * | 2020-11-30 | 2021-03-16 | 云南省烟草质量监督检测站 | Cigarette true and false identification method, device and equipment based on residual convolutional neural network |
CN112508072B (en) * | 2020-11-30 | 2024-04-26 | 云南省烟草质量监督检测站 | Cigarette true and false identification method, device and equipment based on residual convolution neural network |
CN112488301A (en) * | 2020-12-09 | 2021-03-12 | 孙成林 | Food inversion method based on multitask learning and attention mechanism |
CN112488301B (en) * | 2020-12-09 | 2024-04-16 | 孙成林 | Food inversion method based on multitask learning and attention mechanism |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhang et al. | Deep learning in food category recognition | |
CN107578060B (en) | Method for classifying dish images based on depth neural network capable of distinguishing areas | |
Chen et al. | Chinesefoodnet: A large-scale image dataset for chinese food recognition | |
US9734426B2 (en) | Automated food recognition and nutritional estimation with a personal mobile electronic device | |
AU2019100969A4 (en) | Chinese Food Recognition and Search System | |
Aslan et al. | Benchmarking algorithms for food localization and semantic segmentation | |
US20240119646A1 (en) | Text editing of digital images | |
CN108647702B (en) | Large-scale food material image classification method based on transfer learning | |
CN109766465A (en) | A kind of picture and text fusion book recommendation method based on machine learning | |
AU2019101149A4 (en) | An Image retrieval System for Brand Logos Based on Deep Learning | |
Sathish et al. | Analysis of Convolutional Neural Networks on Indian food detection and estimation of calories | |
CN112906780A (en) | Fruit and vegetable image classification system and method | |
Tripathi et al. | Detection of various categories of fruits and vegetables through various descriptors using machine learning techniques | |
Shao et al. | Research on automatic identification system of tobacco diseases | |
Suddul et al. | A comparative study of deep learning methods for food classification with images | |
Nirmal et al. | Pomegranate leaf disease detection using supervised and unsupervised algorithm techniques | |
JP6995262B1 (en) | Learning systems, learning methods, and programs | |
Al-Tuwaijari et al. | Deep Learning Techniques Toward Advancement of Plant Leaf Diseases Detection | |
Hussein | Feature weighting based food recognition system [J] | |
Lan et al. | [Retracted] Accurate Real‐Life Chinese Dish Recognition | |
Yang et al. | Multi-Growth Period Tomato Fruit Detection Using Improved Yolov5 | |
Nilsson et al. | A comparison of image and object level annotation performance of image recognition cloud services and custom Convolutional Neural Network models | |
Ramkumar et al. | A Real-time Food Image Recognition System to Predict the Calories by Using Intelligent Deep Learning Strategy | |
Anggoro et al. | Classification of Solo Batik patterns using deep learning convolutional neural networks algorithm | |
Bao et al. | Predicting and Visualizing Citrus Color Transformation Using a Deep Mask-Guided Generative Network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FGI | Letters patent sealed or granted (innovation patent) | ||
MK22 | Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry |