CN110968672B - False public opinion identification method for food safety based on neural network

Info

Publication number: CN110968672B
Application number: CN201911220854.XA
Authority: CN
Inventors: 徐泽龙; 左敏; 张青川; 蔡圆媛
Original assignee: Beijing Technology and Business University
Current assignee: Beijing Technology and Business University
Priority date: 2019-12-03
Filing date: 2019-12-03
Publication date: 2022-06-10
Anticipated expiration: 2039-12-03
Also published as: CN110968672A

Abstract

The invention discloses a neural-network-based method for identifying false public opinion about food safety. It relates to the field of artificial intelligence and can monitor online public opinion and screen out false news. The method comprises: building a food risk factor entity library; building a food name entity library; constructing a dynamically updated library of official rumor-refutation news; and building a neural network model that classifies news as true or false. When new news is input, the food names and risk factors it mentions are labeled by lookup in the food risk factor entity library and the food name entity library, and the news is preliminarily classified. The news is then compared for similarity against the official rumor-refutation news library; if no related rumor-refutation news is found, the neural network model classifies the news as true or false.

Description

False public opinion identification method for food safety based on neural network

Technical Field

The invention relates to the field of artificial intelligence, in particular to a false public opinion identification method for food safety based on a neural network.

Background

At present, a large volume of news and public opinion reports related to food safety appears online every day. If this information is left unsupervised, false news can cause unnecessary social panic and disturb social order, affecting people's work and daily life. Research on and development of integrated safety monitoring and early-warning information for the key food varieties under food safety supervision makes it possible to collect, analyze, and distribute data such as supervisory spot checks and risk monitoring of those varieties. This greatly improves supervision efficiency, supports deep mining, full use, and sharing of food safety information, provides useful experience for building food safety supervision information systems, and gives food safety regulators and consumers a powerful tool for avoiding food hazards. In supervision work, news and rumors must be assessed quickly, so applying natural language processing technology to assist food safety supervision has considerable research value.

Disclosure of Invention

To meet the current need to monitor food-safety-related news on the Internet, the invention provides a method for classifying news texts so that false food safety news on the network can be screened out.

The invention provides a neural-network-based method for identifying false public opinion about food safety, comprising the following steps:

Step 1, constructing an underlying knowledge base comprising a food name database and a food risk factor database;

Step 2, constructing an official rumor-refutation news library that is updated in real time;

Step 3, constructing a neural network model for entity recognition that combines a bidirectional long short-term memory (BiLSTM) network with a conditional random field (CRF), and training it on news corpora in which food entity names and risk factor names are correctly labeled to obtain the final model; during prediction, the vector representation of a news text sequence is the input and the label sequence is the output; the food names and risk factor names mentioned in the news are obtained from the labeling result, the underlying knowledge base is used to judge whether the labeling result is reliable, and the truth of the news is preliminarily determined;

Step 4, constructing a convolutional neural network (CNN) model as the news public opinion classification model, and training it on news corpora correctly labeled as true or false to obtain the classification model; during prediction, the labeling result of step 3 is the input and the true or false classification of the news public opinion is the output.
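As a rough illustration of how the labeling result of step 3 can be turned into food names and risk factor names, the sketch below collects entity spans from a BIO-tagged character sequence. The tag names `B-FOOD`, `I-FOOD`, `B-RISK`, and `I-RISK` and the function name `extract_entities` are assumptions made for this example; the patent does not specify its tag set beyond the `B-`/`I-`/`O` pattern.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) spans from a BIO-tagged sequence."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity that was still open.
            if current_tokens:
                entities.append(("".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag closes the open entity.
            if current_tokens:
                entities.append(("".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append(("".join(current_tokens), current_type))
    return entities


# Hypothetical example: one tag per Chinese character.
example_tokens = list("牛奶中铅超标")
example_tags = ["B-FOOD", "I-FOOD", "O", "B-RISK", "O", "O"]
example_entities = extract_entities(example_tokens, example_tags)
```

The extracted names can then be looked up in the food name database and the food risk factor database to judge whether the labeling result is reliable.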

Further, the food name database includes the names of existing foods, including processed food products, edible oils, fats and their products, seasonings, meat products, dairy products, beverages, instant foods, biscuits, canned foods, frozen drinks, quick-frozen foods, potato chips and puffed foods, confectionery products, tea and related products, wines, vegetable products, fruit products, roasted seeds and nut products, egg products, cocoa and roasted coffee products, sugar, aquatic products, starch and starch products, cakes, bean products, bee products, health foods, special dietary foods, formula foods for special medical purposes, infant formula foods, catering foods, edible agricultural products, and food additives; for newly appearing foods, the food name database is supplemented in time through periodic updates.

Further, a food risk factor is a substance whose content in food must be monitored to determine whether it exceeds the permitted limit. The food risk factor names in the food risk factor database include lead, benzoic acid, nitrite, sunset yellow, and total colony count, and different foods have different corresponding risk factor names; for newly defined or newly discovered risk factors, the food risk factor database is supplemented in time through periodic updates.

Further, a bidirectional long short-term memory network combined with a conditional random field is trained as an entity recognition model that can accurately recognize food names and food risk factor names.

Further, in step 1, after the food name database and the food risk factor database are constructed, the two underlying knowledge bases are linked directly: each food is associated with the risk factors to be tested for it, and the maximum permitted content of each is recorded.

Further, in step 3, the news public opinion is labeled; the labeled content comprises the food names and food risk factor names mentioned in the news, and the truth of the news is preliminarily judged from the labeling result and the official rumor-refutation news library.
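The patent does not say which similarity measure is used when news is compared against the official library. As one possible sketch, the example below uses Jaccard similarity over character bigrams; the measure, the threshold of 0.5, and the names `char_bigrams`, `jaccard`, and `find_refutation` are all assumptions made for illustration.

```python
from typing import List, Optional, Set, Tuple


def char_bigrams(text: str) -> Set[str]:
    """Return the set of character bigrams of a text, ignoring whitespace."""
    cleaned = "".join(text.split())
    return {cleaned[i:i + 2] for i in range(len(cleaned) - 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_refutation(
    news: str, library: List[str], threshold: float = 0.5
) -> Optional[Tuple[str, float]]:
    """Return the most similar library entry at or above the threshold, if any."""
    news_grams = char_bigrams(news)
    best: Optional[Tuple[str, float]] = None
    for entry in library:
        score = jaccard(news_grams, char_bigrams(entry))
        if score >= threshold and (best is None or score > best[1]):
            best = (entry, score)
    return best


# Hypothetical library entries, used only to exercise the function.
example_library = ["plastic rice is a rumor", "lead in milk report refuted"]
match = find_refutation("plastic rice is a rumor", example_library)
no_match = find_refutation("zzzz", example_library)
```

When `find_refutation` returns `None`, no related rumor-refutation news was found, which is the case the method hands over to the neural network classifier.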

Further, a convolutional neural network model is trained as the classification model for news public opinion: the news corpora are converted into vector representations as the input of the network, the model is built with a convolutional neural network, and it is trained on the existing news corpora to obtain the final model; the final output of the model is whether the news public opinion is true or false.

Further, two neural network models are trained: a bidirectional long short-term memory network with a conditional random field for recognizing food entity names and risk factor names, and a convolutional neural network for classifying news public opinion. At the start of training, the weights are initialized randomly; after the output of the last layer is computed, the cross entropy between the predicted value and the true value is used as the loss function, the loss is minimized with the adaptive moment estimation (Adam) algorithm, and the learning rate is adjusted as training proceeds. To improve training efficiency, one batch of data is input at a time, and to prevent overfitting, a certain proportion of weight values is randomly set to 0 during training.
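The three training ingredients named above (cross-entropy loss, adaptive moment estimation, and randomly zeroing values) can be sketched in plain Python as follows. This is a minimal illustration, not the patented training code: the hyperparameter defaults are common choices rather than values from the patent, the `dropout` helper rescales the kept values (an assumption the patent does not mention), and the names `cross_entropy`, `dropout`, and `Adam` are chosen for this example.

```python
import math
import random
from typing import List


def cross_entropy(probs: List[float], true_index: int) -> float:
    """Cross entropy between predicted class probabilities and the true class."""
    return -math.log(max(probs[true_index], 1e-12))


def dropout(values: List[float], rate: float, rng: random.Random) -> List[float]:
    """Zero each value with probability `rate`; rescale the rest (assumption)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


class Adam:
    """Adaptive moment estimation over a flat list of parameters."""

    def __init__(self, size: int, lr: float = 0.001, beta1: float = 0.9,
                 beta2: float = 0.999, eps: float = 1e-8) -> None:
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = [0.0] * size  # running mean of gradients
        self.v = [0.0] * size  # running mean of squared gradients
        self.t = 0             # number of steps taken

    def step(self, params: List[float], grads: List[float]) -> List[float]:
        """Return the parameters after one bias-corrected Adam update."""
        self.t += 1
        updated = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            updated.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return updated


# One update on a single parameter with a made-up gradient.
optimizer = Adam(size=1, lr=0.1)
updated = optimizer.step([1.0], [2.0])
```

In practice a deep learning framework would supply these pieces; the sketch only shows what each one computes for a single batch.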

The advantages of the method are that the truth of news public opinion can be judged quickly and efficiently, the main food names and risk factor names involved in the news are clearly labeled, the analysis of the whole news event is presented clearly, and the supervisor is assisted in making correct decisions.

Drawings

FIG. 1 is a schematic flow chart of a neural network-based food safety false public opinion identification method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the dynamic update of the underlying knowledge base;

FIG. 3 is a network diagram of entity identification;

FIG. 4 is a schematic diagram of a convolutional neural network classification.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort fall within the protection scope of the present invention.

According to an embodiment of the invention, a neural-network-based method for identifying false public opinion about food safety is provided, comprising the following steps:

Step 4, constructing a convolutional neural network model as the news public opinion classification model, and training it on news corpora correctly labeled as true or false to obtain the classification model; during prediction, the labeling result of step 3 is the input and the true or false classification of the news public opinion is the output.

Referring to FIG. 1, which shows an overall schematic diagram of the method, a news public opinion text is taken as input, and the entity names in the text are labeled by entity recognition. A self-built underlying knowledge base, comprising a food name knowledge base and a food risk factor knowledge base, is used to match the labeled entities. Based on the matched information, supporting evidence is sought in the official rumor-refutation library. For a text with sufficient evidence, a judgment is given directly; a text with insufficient evidence is converted into a vector representation and input into the neural network classification model, which outputs whether the news is true or false as the final result.
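The overall flow of FIG. 1 can be sketched as a single function that wires the stages together. The four stages are passed in as callables because the patent describes them as separately built components; the function name `classify_news`, the callable signatures, and the stub lambdas in the usage example are assumptions made for this illustration.

```python
from typing import Callable, List, Optional, Tuple

Entity = Tuple[str, str]  # (entity text, entity type)


def classify_news(
    text: str,
    recognize_entities: Callable[[str], List[Entity]],
    in_knowledge_base: Callable[[Entity], bool],
    lookup_refutation: Callable[[str, List[Entity]], Optional[bool]],
    neural_classifier: Callable[[str], bool],
) -> Tuple[bool, str, List[Entity]]:
    """Return (is_true, source of the decision, matched entities)."""
    # Stage 1: label food names and risk factor names in the text.
    entities = recognize_entities(text)
    # Stage 2: keep only entities that match the underlying knowledge base.
    matched = [entity for entity in entities if in_knowledge_base(entity)]
    # Stage 3: look for evidence in the official rumor-refutation library.
    verdict = lookup_refutation(text, matched)
    if verdict is not None:
        return verdict, "rumor-refutation library", matched
    # Stage 4: evidence is insufficient, so fall back to the neural classifier.
    return neural_classifier(text), "neural classifier", matched


# Stub components, used only to show the two possible paths.
with_evidence = classify_news(
    "some news text",
    lambda text: [("milk", "FOOD"), ("unknown", "FOOD")],
    lambda entity: entity[0] == "milk",
    lambda text, entities: False,
    lambda text: True,
)
without_evidence = classify_news(
    "some news text",
    lambda text: [],
    lambda entity: True,
    lambda text, entities: None,
    lambda text: True,
)
```

Returning the source of the decision alongside the verdict keeps it visible whether a result rests on official evidence or on the model's prediction.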

In the embodiment shown in FIG. 2, the underlying knowledge base comprises a food name knowledge base and a food risk factor library. The food name knowledge base, built by the invention, stores basic attributes such as food names and short names. The food risk factor library stores attributes such as the name of each risk factor to be tested in food, the name of the corresponding food, the maximum permitted content, and the detection method; because the same risk factor may need to be tested in different foods, the judgment criteria for a risk factor differ from food to food. In addition, the underlying knowledge base must be updated dynamically: the method updates it in real time or periodically for newly appearing foods or newly discovered food risk factors. For food names and food risk factor names that cannot be matched in the underlying entity library, the information in the official rumor-refutation library is consulted to decide whether the underlying knowledge base should be updated.
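One way to picture the linked knowledge base and its dynamic update is the data structure below. It is a sketch under stated assumptions: the class and field names (`RiskLimit`, `KnowledgeBase`, `pending`, and so on) are invented for this example, and the limit value, unit, and detection method in the usage lines are placeholders, not real regulatory figures.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class RiskLimit:
    max_content: float     # maximum permitted content of the risk factor
    unit: str
    detection_method: str


@dataclass
class KnowledgeBase:
    aliases: Dict[str, str] = field(default_factory=dict)  # name or short name -> food name
    limits: Dict[Tuple[str, str], RiskLimit] = field(default_factory=dict)
    pending: Set[str] = field(default_factory=set)          # unmatched names awaiting review

    def add_food(self, name: str, *short_names: str) -> None:
        """Register a food under its name and any short names."""
        for alias in (name, *short_names):
            self.aliases[alias] = name

    def set_limit(self, food: str, risk_factor: str, limit: RiskLimit) -> None:
        """Link a risk factor to an already registered food."""
        self.limits[(self.aliases[food], risk_factor)] = limit

    def limit_for(self, food: str, risk_factor: str) -> Optional[RiskLimit]:
        """Look up a limit; unknown food names are queued for review."""
        canonical = self.aliases.get(food)
        if canonical is None:
            self.pending.add(food)
            return None
        return self.limits.get((canonical, risk_factor))

    def confirm_pending(self, name: str, in_official_library: bool) -> bool:
        """Add a queued name only if the official library supports it."""
        if name not in self.pending:
            return False
        self.pending.discard(name)
        if in_official_library:
            self.add_food(name)
        return in_official_library


kb = KnowledgeBase()
kb.add_food("whole milk", "milk")
kb.set_limit("milk", "lead", RiskLimit(1.0, "mg/kg", "placeholder method"))
known_limit = kb.limit_for("milk", "lead")
unknown_limit = kb.limit_for("oat drink", "lead")  # not registered, so queued
```

Keying the limits by (food, risk factor) pairs reflects the point that the same risk factor is judged differently in different foods.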

In another embodiment, shown in FIG. 3, a news text is converted into a vector representation as the input of a bidirectional long short-term memory network, and the network is combined with a conditional random field to obtain the entity names contained in the text. Given an input sequence of length $n$, $Q = (w_1, w_2, \ldots, w_n)$, where $w_i$ denotes the $i$-th word, one-hot encoding is used to obtain a vector representation of each word, $X = (x_1, x_2, \ldots, x_n)$, where $x_i$ is the vector representation of the $i$-th word. $X$ is then fed in forward order and in reverse order into two separate long short-term memory networks, and the resulting state $h_t$ at time $t$ contains the context information at that time step. The output of the bidirectional long short-term memory layer is, for each Chinese character $w_t$, the probability of each label; the resulting probability matrix is the input of the conditional random field layer, which scores the different label sequences. In this way, invalid label sequences such as 'B-ER, O, I-ER' are effectively avoided.
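To show how sequence-level scoring rules out a sequence such as 'B-ER, O, I-ER', the sketch below runs Viterbi decoding over per-position label scores with a hard constraint that an `I-` label may only follow a `B-` or `I-` label of the same type. This is a simplification: a trained conditional random field learns real-valued transition scores, whereas this example uses only allowed or forbidden transitions. The score values and the function names `allowed` and `viterbi` are assumptions for illustration.

```python
from typing import Dict, List, Optional

NEG_INF = float("-inf")


def allowed(prev: str, curr: str) -> bool:
    """An I-X label is valid only directly after B-X or I-X."""
    if curr.startswith("I-"):
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True


def viterbi(emissions: List[Dict[str, float]], tags: List[str]) -> List[str]:
    """Return the highest-scoring label sequence that respects `allowed`."""
    # A sequence may not start with an I- label.
    scores = {t: NEG_INF if t.startswith("I-") else emissions[0][t] for t in tags}
    backpointers: List[Dict[str, Optional[str]]] = []
    for emission in emissions[1:]:
        new_scores: Dict[str, float] = {}
        pointers: Dict[str, Optional[str]] = {}
        for curr in tags:
            best_prev, best_score = None, NEG_INF
            for prev in tags:
                if allowed(prev, curr) and scores[prev] > best_score:
                    best_prev, best_score = prev, scores[prev]
            pointers[curr] = best_prev
            new_scores[curr] = NEG_INF if best_prev is None else best_score + emission[curr]
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(tags, key=lambda t: scores[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


tags = ["B-ER", "I-ER", "O"]
# Made-up scores whose per-position maxima give the invalid B-ER, O, I-ER.
emissions = [
    {"B-ER": 2.0, "I-ER": 0.1, "O": 1.0},
    {"B-ER": 0.1, "I-ER": 1.0, "O": 1.5},
    {"B-ER": 0.2, "I-ER": 2.0, "O": 1.0},
]
greedy = [max(tags, key=lambda t: e[t]) for e in emissions]
decoded = viterbi(emissions, tags)
```

Here `greedy` picks the best label at each position independently and produces the invalid sequence, while `decoded` is the best sequence among the valid ones.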

In the embodiment shown in FIG. 4, a news text whose truth cannot be established from the available evidence must be judged by the neural network model. The processed news text is converted into a vector representation, passed through multiple convolutional and pooling layers and then a fully connected layer, and the probability of each final class is obtained with the softmax function:

$$P(s_i) = \frac{e^{g_i}}{\sum_{j=1}^{n} e^{g_j}}$$

The function maps the neuron outputs into the interval (0, 1), where $n$ is the number of classes, $i$ is one of the classes indexed by $j = 1, \ldots, n$, $g_i$ is the score of class $i$, and $P(s_i)$ is the probability of the $i$-th class.
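A minimal sketch of the softmax step, assuming the fully connected layer has already produced one score per class. Subtracting the maximum score before exponentiating is a standard numerical-stability measure that leaves the result unchanged; the function name `softmax` and the example scores are chosen for this illustration.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Map class scores to probabilities that lie in (0, 1) and sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two classes (true news, false news) with made-up scores.
probabilities = softmax([2.0, 0.0])
predicted_class = probabilities.index(max(probabilities))
```

With two classes, the class with the larger probability is taken as the true or false verdict for the news text.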

Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, the invention is not limited to the scope of these embodiments. Various changes will be apparent to those skilled in the art, and all inventions that make use of the inventive concept are protected, provided they do not depart from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A neural-network-based method for identifying false public opinion about food safety, characterized by comprising the following steps:

Step 3, constructing a neural network model for entity recognition that combines a bidirectional long short-term memory network with a conditional random field, and training it on news corpora in which food entity names and risk factor names are correctly labeled to obtain the final model; during prediction, the vector representation of a news text sequence is taken as input, the entity names in the text are labeled by entity recognition and matched against the underlying knowledge base, and the labeling result of the sequence is obtained from the matched information as output; the food names and risk factor names mentioned in the news are obtained from the labeling result, the underlying knowledge base is used to judge whether the labeling result is reliable, and the truth of the news is preliminarily determined; supporting evidence is obtained from the official rumor-refutation library, a judgment is given directly for a text with sufficient evidence, and a text with insufficient evidence is converted into a vector representation and input into the neural network classification model;

Step 4, constructing a convolutional neural network model as the news public opinion classification model, and training it on news corpora correctly labeled as true or false to obtain the classification model; during prediction, the labeling result of step 3 is taken as input, and the true or false classification of the news public opinion is finally obtained as output;

further comprising: training two neural network models, namely a bidirectional long short-term memory network with a conditional random field for recognizing food entity names and risk factor names, and a convolutional neural network for classifying news public opinion; at the start of training, the weights are initialized randomly; after the output of the last layer is computed, the cross entropy between the predicted value and the true value is used as the loss function, the loss is minimized with the adaptive moment estimation algorithm, and the learning rate is adjusted as training proceeds; to improve training efficiency, one batch of data is input at a time, and to prevent overfitting, a certain proportion of weight values is randomly set to 0 during training.

2. The neural-network-based method for identifying false public opinion about food safety as claimed in claim 1, wherein:

the food name database comprises the names of existing foods, including processed food products, edible oils, fats and their products, seasonings, meat products, dairy products, beverages, instant foods, biscuits, canned foods, frozen drinks, quick-frozen foods, potato chips and puffed foods, confectionery products, tea and related products, wines, vegetable products, fruit products, roasted seeds and nut products, egg products, cocoa and roasted coffee products, sugar, aquatic products, starch and starch products, cakes, bean products, bee products, health foods, special dietary foods, formula foods for special medical purposes, infant formula foods, catering foods, edible agricultural products, and food additives; for newly appearing foods, the food name database is supplemented in time through periodic updates.

3. The neural-network-based method for identifying false public opinion about food safety as claimed in claim 1, wherein:

a food risk factor is a substance whose content in food must be checked to determine whether it exceeds the permitted limit, and the food risk factor names in the food risk factor database include lead, benzoic acid, nitrite, sunset yellow, and total colony count, with different foods having different corresponding food risk factor names; for newly defined or newly discovered risk factors, the food risk factor database is supplemented in time through periodic updates.

4. The false public opinion recognition method for food safety based on neural network as claimed in claim 1, wherein:

a bidirectional long short-term memory network combined with a conditional random field is trained as an entity recognition model capable of accurately recognizing food names and food risk factor names.

5. The false public opinion recognition method for food safety based on neural network as claimed in claim 1, wherein:

in step 1, after the food name database and the food risk factor database are constructed, the two underlying knowledge bases are linked directly, each food is associated with the risk factors to be tested for it, and the maximum permitted content of each is recorded.

6. The false public opinion recognition method for food safety based on neural network as claimed in claim 1, wherein:

in step 3, the news public opinion is labeled, the labeled content comprises the food names and food risk factor names mentioned in the news, and the truth of the news public opinion is preliminarily judged from the labeling result and the official rumor-refutation news library.

7. The false public opinion recognition method for food safety based on neural network as claimed in claim 1, wherein:

a convolutional neural network model is trained as the model for classifying news public opinion as true or false: the news corpora are converted into vector representations as the input of the network, the model is built with a convolutional neural network, and it is trained on the existing news corpora to obtain the final model; the final output of the model is whether the news public opinion is true or false.