CN111507320A - Detection method, device, equipment and storage medium for kitchen violation behaviors - Google Patents

Detection method, device, equipment and storage medium for kitchen violation behaviors

Info

Publication number
CN111507320A
CN111507320A
Authority
CN
China
Prior art keywords
violation
kitchen
public
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010617901.0A
Other languages
Chinese (zh)
Inventor
谢雨洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN202010617901.0A priority Critical patent/CN111507320A/en
Publication of CN111507320A publication Critical patent/CN111507320A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application relates to the field of biometric identification, and provides a kitchen violation behavior detection method, apparatus, computer device, and storage medium. The method comprises the following steps: obtaining kitchen violation pictures and constructing a public violation picture set A and a real violation picture set B; constructing a public violation region set a and a real violation region set b from the violation pictures in sets A and B, respectively; constructing an encoding-decoding model; performing two rounds of iterative training on the model, first with the public violation picture set A and public violation region set a, then with the real violation picture set B and real violation region set b, to obtain a kitchen violation behavior detection model; and inputting images shot in real time by a kitchen camera into the detection model for violation detection. In addition, the invention relates to blockchain technology: the detection model data information may be stored in a blockchain. Based on deep learning, the method and device can accurately extract violation visual features, are not limited to a particular kitchen scene, and have strong generalization capability.

Description

Detection method, device, equipment and storage medium for kitchen violation behaviors
Technical Field
The present application relates to the field of biometric identification, and in particular to a method, an apparatus, a device, and a storage medium for detecting kitchen violation behaviors.
Background
Computer vision has matured rapidly in recent years, and intelligent video surveillance technology based on it is widely applied in scenarios such as restaurants, companies, gymnasiums, construction sites, and railway stations.
The general approach to computer-vision-based intelligent catering video surveillance is as follows: first, the targets of interest in the current video frame are detected with an object detection method; then the actions and clothing of the operators in the current frame are judged on the basis of kitchen semantic information, so as to decide whether all operators in the frame meet the kitchen standard; finally, the video frames in which operators commit violations are found and pushed to the catering manager. This saves the labor cost of discovering kitchen operation violations in the catering industry and provides a reliable solution for the effective management and supervision of the various stores in the catering industry.
However, traditional image and video detection algorithms have certain limitations. First, their fixed feature convolution kernels are designed empirically by domain experts, so their generalization ability is weak and they cannot meet the detection requirements of many scenes. Second, traditional detection algorithms can only extract shallow image features and cannot obtain the deep semantic information of an image, so detection accuracy cannot be guaranteed.
Disclosure of Invention
In order to solve the technical problems of weak generalization ability and low detection accuracy in existing computer vision technology, the invention provides a deep-learning-based method for detecting kitchen violation behaviors, which can accurately extract violation visual features, is not limited to a particular kitchen scene, and has strong generalization ability. The specific implementation is as follows:
A method for detecting kitchen violation behaviors, comprising:
acquiring kitchen violation pictures disclosed on the network to construct a public atlas A, acquiring kitchen violation pictures captured by cameras in actual kitchen scenes to construct a real atlas B, constructing a public violation region set a using the violation pictures in the public atlas A, and constructing a real violation region set b using the violation pictures in the real atlas B;
constructing an encoding-decoding detection model based on a convolutional neural network structure;
the encoding-decoding model comprises an encoder and a decoder, wherein the decoder is a decoupled structure of the encoder;
performing a first round of iterative training on the encoding-decoding model with the violation pictures in the public atlas A and the annotation files in the public violation region set a as training samples, and then performing a second round of iterative training with the violation pictures in the real atlas B and the annotation files in the real violation region set b as training samples, to obtain a kitchen violation behavior detection model;
and acquiring images shot by the kitchen camera in real time and inputting them into the kitchen violation behavior detection model for violation detection.
In one possible embodiment, after acquiring the kitchen violation pictures disclosed on the network and constructing the public atlas A, the method further includes:
balancing the sample distribution in the public atlas A to obtain a public atlas A with uniformly distributed samples;
and performing image enhancement on the samples in the uniformly distributed public atlas A to obtain a public atlas A with diverse sample data.
In one possible embodiment, constructing the public violation region set a using the violation pictures in the public atlas A includes:
marking the violation regions of the kitchen violation behavior pictures in the public atlas A with an annotation tool, generating the annotation files, and constructing the public violation region set a with the annotation files as sample data.
In one possible embodiment, the first round of iterative training on the encoding-decoding model includes:
performing feature-matching training on the encoding-decoding model to obtain a detection model with accurate feature matching;
and performing violation classification training on the detection model with accurate feature matching.
In one possible embodiment, the feature-matching training on the encoding-decoding model includes:
when the intersection-over-union ratio IOU1 is smaller than a first threshold R1 in a preset threshold set R, judging the prediction to be wrong and ending the training;
when the intersection-over-union ratio IOU1 is greater than the first threshold R1, comparing IOU1 with a second threshold R2 in the preset threshold set R;
when the intersection-over-union ratio IOU1 is smaller than the second threshold R2, judging the prediction to be wrong;
and when the intersection-over-union ratio IOU1 is greater than the second threshold R2, taking the results originally predicted in the interval [R1, R2] as negative samples, balancing the positive and negative samples for retraining, recalculating the intersection-over-union ratio to obtain IOU2, and comparing it with the next threshold in the preset threshold set R, iteratively raising the intersection-over-union ratio; the feature-matching training ends when a prediction is judged wrong or all thresholds in the preset threshold set R have been compared.
In one possible embodiment, the violation classification training on the detection model with accurate feature matching includes:
calculating the cross-entropy loss function value between the input layer and the output layer of the encoding-decoding model;
and when the cross-entropy loss function value is greater than a preset threshold K, adjusting the classification network parameters of the encoding-decoding model according to the cross-entropy loss function value to obtain a new output result, recalculating the cross-entropy loss function value between the input layer and the output layer of the encoding-decoding model, and repeating over multiple iterations until the cross-entropy loss function value is smaller than the preset threshold K, at which point the training ends.
In one possible embodiment, before the first round of iterative training on the encoding-decoding model, the method further includes:
inputting the training samples into the encoder and encoding them with the encoder to obtain the m-dimensional vectors used to calculate the cross-entropy loss function value.
A kitchen violation behavior detection device, comprising:
a sample library construction module: configured to acquire kitchen violation pictures disclosed on the network to construct a public atlas A, acquire kitchen violation pictures captured by cameras in actual kitchen scenes to construct a real atlas B, construct a public violation region set a using the violation pictures in the public atlas A, and construct a real violation region set b using the violation pictures in the real atlas B.
A model initialization module: configured to construct an encoding-decoding detection model based on a convolutional neural network structure.
A model training module: configured to perform a first round of iterative training on the encoding-decoding model with the violation pictures in the public atlas A and the annotation files in the public violation region set a as training samples, and then perform a second round of iterative training with the violation pictures in the real atlas B and the annotation files in the real violation region set b as training samples, to obtain a kitchen violation behavior detection model.
And a violation detection module: configured to acquire images shot by the kitchen camera in real time and input them into the kitchen violation behavior detection model for violation detection.
In one possible embodiment, the sample library construction module is specifically configured to:
balance the sample distribution in the public atlas A to obtain a public atlas A with uniformly distributed samples;
perform image enhancement on the samples in the uniformly distributed public atlas A to obtain a public atlas A with diverse sample data;
and mark the violation regions of the kitchen violation behavior pictures in the public atlas A with an annotation tool, generate the annotation files, and construct the public violation region set a with the annotation files as sample data.
In one possible embodiment, the model training module is specifically configured to:
perform feature-matching training on the encoding-decoding model to obtain a detection model with accurate feature matching;
perform violation classification training on the detection model with accurate feature matching;
when the intersection-over-union ratio IOU1 is smaller than a first threshold R1 in a preset threshold set R, judge the prediction to be wrong and end the training;
when the intersection-over-union ratio IOU1 is greater than the first threshold R1, compare IOU1 with a second threshold R2 in the preset threshold set R;
when the intersection-over-union ratio IOU1 is smaller than the second threshold R2, judge the prediction to be wrong;
and when the intersection-over-union ratio IOU1 is greater than the second threshold R2, take the results originally predicted in the interval [R1, R2] as negative samples, balance the positive and negative samples for retraining, recalculate the intersection-over-union ratio to obtain IOU2, and compare it with the next threshold in the preset threshold set R, iteratively raising the intersection-over-union ratio; the feature-matching training ends when a prediction is judged wrong or all thresholds in the preset threshold set R have been compared.
Calculate the cross-entropy loss function value between the input layer and the output layer of the encoding-decoding model;
when the cross-entropy loss function value is greater than a preset threshold K, adjust the classification network parameters of the encoding-decoding model according to the cross-entropy loss function value to obtain a new output result, recalculate the cross-entropy loss function value between the input layer and the output layer of the encoding-decoding model, and repeat over multiple iterations until the cross-entropy loss function value is smaller than the preset threshold K, at which point the training ends.
And input the training samples into the encoder and encode them with the encoder to obtain the m-dimensional vectors used to calculate the cross-entropy loss function value.
A kitchen violation behavior detection apparatus comprising a memory and a processor, the memory having stored therein computer-readable instructions that, when executed by the processor, cause the processor to perform the kitchen violation behavior detection method described above.
A storage medium having computer-readable instructions stored thereon which, when executed by one or more processors, cause the one or more processors to perform the above-described kitchen violation behavior detection method.
Compared with the prior art: the method constructs a public atlas A by acquiring kitchen violation pictures disclosed on the network, constructs a real atlas B from kitchen violation pictures captured by cameras in actual kitchen scenes, constructs a public violation region set a using the violation pictures in the public atlas A, and constructs a real violation region set b using the violation pictures in the real atlas B; constructs an encoding-decoding detection model based on a convolutional neural network structure; performs a first round of iterative training on the encoding-decoding model with the violation pictures in the public atlas A and the annotation files in the public violation region set a as training samples, and then a second round of iterative training with the violation pictures in the real atlas B and the annotation files in the real violation region set b as training samples, obtaining a kitchen violation behavior detection model; and acquires images shot by the kitchen camera in real time and inputs them into the kitchen violation behavior detection model for violation detection. A kitchen violation behavior detection method with strong generalization capability and high detection accuracy is thereby realized.
Drawings
Various other advantages and benefits will become apparent to those skilled in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be construed as limiting the application.
FIG. 1 is a general flow diagram of a kitchen violation detection method in one embodiment of the present application;
FIG. 2 is a block diagram of an encoding-decoding model of the present application in one embodiment.
Fig. 3 is a block diagram of a kitchen violation detection device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
As used herein, the singular forms "a", "an", and "the" may include the plural forms as well, unless expressly stated otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Fig. 1 is an overall flowchart of a kitchen violation behavior detection method in an embodiment of the present application; as shown in fig. 1, the method includes the following steps:
S1, acquiring kitchen violation pictures disclosed on the network to construct a public atlas A, acquiring kitchen violation pictures captured by cameras in actual kitchen scenes to construct a real atlas B, constructing a public violation region set a using the violation pictures in the public atlas A, and constructing a real violation region set b using the violation pictures in the real atlas B;
Kitchen violation pictures from the ImageNet (visual database) data set and the COCO (visual database) data set are acquired as a first picture sample set, kitchen violation pictures of catering enterprises are retrieved with a network search engine as a second picture sample set, and the two sets are merged into the final violation picture sample set, from which the public atlas A = {A1, A2, …, AN} is constructed. Kitchen video surveillance footage is acquired, 5000 video frames containing violation behaviors are taken as violation pictures, and the real atlas B = {B1, B2, …, BN} is constructed. In this embodiment, 5000 instances of each violation behavior are annotated as training sample data, where N is the total number of violation pictures, i.e., 5000 in this implementation.
Because acquiring and annotating data from an actual scene is costly, the model is first trained on the public data set and, after it converges, transferred to the actual-scene data for fine-tuning. The pre-trained shallow network parameters of the model are frozen while the deep network parameters are trained and optimized with actual data; this is iterated until the model converges, after which the shallow network parameters are also optimized to obtain the final model parameters.
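As an illustrative sketch of this two-stage fine-tuning (assuming a torchvision VGG16 backbone; the split point between "shallow" and "deep" layers and the learning rates are assumptions for illustration, not values from the application):

```python
import torch
import torchvision

model = torchvision.models.vgg16(weights="IMAGENET1K_V1")  # pre-trained on public data

# Freeze the pre-trained shallow layers; optimize only the deep layers
# with the actual-scene data.
for param in model.features[:10].parameters():
    param.requires_grad = False
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3, momentum=0.9
)
# ... iterate on the real atlas B until the model converges ...

# Then unfreeze the shallow layers and optimize them as well, at a lower
# learning rate, to obtain the final model parameters.
for param in model.features[:10].parameters():
    param.requires_grad = True
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```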
In one embodiment, the construction of the public atlas A in S1 may include the following steps:
balancing the sample distribution in the public atlas A to obtain a public atlas A with uniformly distributed samples;
The distribution of the picture samples is balanced from multiple angles: for example, adding violation pictures with the different types of hats common in kitchen scenes, such as disposable hats and cloth hats; adding violation pictures with hats of different colors, for example a head chef wearing a white hat with golden stripes while an ordinary chef wears a plain white hat; and adding violation pictures with hats of different shapes, such as square or oblate hats.
and performing image enhancement on the samples in the uniformly distributed public atlas A to obtain a public atlas A with diverse sample data.
Image enhancement adds information to or transforms the original image in a purposeful way, selectively highlighting features of interest or suppressing (masking) unwanted features so that the image matches the desired visual response characteristics. In this embodiment, the image enhancement methods mainly adopted, such as horizontal flipping, image scaling, image cropping, and image rotation, increase the diversity of the sample data.
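A minimal augmentation sketch using torchvision, covering the methods named above; the exact magnitudes are illustrative assumptions:

```python
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                # horizontal flipping
    transforms.RandomRotation(degrees=15),                 # image rotation
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),   # scaling + cropping
    transforms.ToTensor(),
])
# Applying `augment` to each picture in public atlas A yields the more
# diverse sample set described above.
```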
Similarly, the same steps may be applied after the real atlas B is constructed.
According to this embodiment, the sample data is uniformly distributed and its diversity is increased, so that the model is highly robust and can adapt to different scenes.
In one embodiment, constructing the public violation region set a using the violation pictures in the public atlas A in S1 may include the following steps:
marking the violation regions of the kitchen violation behavior pictures in the public atlas A with an annotation tool, generating the public violation region annotation files, and constructing the public violation region set a with the annotation files as sample data.
The violation region of each type of violation scene in a picture is framed with the annotation tool and given a violation type label; the tool then generates a corresponding xml file containing the coordinates of the annotation box in the picture and the label. A violation region is specifically the set of pixel positions of the violation behavior in the corresponding violation picture. For example, for a violation picture whose violation behavior is failing to wear a chef hat as required, the violation region is the set of pixel positions of the chef's head in the picture, the violation type label is "wearing a hat without rules", and the generated xml file contains the coordinates of the violation region and the violation label "wearing a hat without rules".
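A sketch of reading one such annotation file; the Pascal-VOC-style tag names (object, name, bndbox) are an assumption here, since the text only states that the xml holds the box coordinates and the label:

```python
import xml.etree.ElementTree as ET

def read_annotation(xml_path):
    """Return a list of (label, (xmin, ymin, xmax, ymax)) violation regions."""
    root = ET.parse(xml_path).getroot()
    regions = []
    for obj in root.iter("object"):
        label = obj.findtext("name")           # e.g. "wearing a hat without rules"
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        regions.append((label, coords))
    return regions
```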
The construction of the real violation region set b may include the same steps described above.
This embodiment determines the distribution of the violation regions and all types of violation labels, providing a strict violation standard for the subsequent training of the violation behavior model.
S2, constructing an encoding-decoding detection model based on a convolutional neural network structure;
The encoding-decoding model is implemented with a mainstream convolutional neural network structure from the computer vision field, such as LeNet-5, VGG, AlexNet, or GoogLeNet. Convolutional neural networks model the mechanism of biological visual perception and can perform both supervised and unsupervised learning; their structure comprises an input layer, hidden layers, and an output layer, and the sharing of convolution kernel parameters within the hidden layers together with the sparsity of inter-layer connections enables a convolutional neural network to learn grid-like features, such as pixels and audio, with little computation, stable results, and no additional feature engineering requirements on the data.
In a preferred embodiment, a VGG network structure is adopted to construct the encoding-decoding model, which comprises an encoder and a decoder, the decoder being a decoupled structure of the encoder. The input layer of the encoding-decoding model receives pictures, and the encoder and decoder perform the standardized processing of the pictures. Referring to fig. 2, the input of the encoder is a picture I and its output is an encoding result X, an m-dimensional vector, which mainly makes it convenient to calculate the model's cross-entropy loss function; the end of model training is controlled by comparing the cross-entropy loss value with a preset threshold P1. The input of the decoder is the encoding result X of the encoder, and its output is the picture Id, i.e.:
$X = \mathrm{Encoder}(I)$
$I_d = \mathrm{Decoder}(X)$
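A minimal PyTorch sketch of such an encoder-decoder pair (not the application's exact architecture): a small VGG-style encoder maps a picture I to an m-dimensional encoding X, and a mirrored decoder maps X back to a picture Id; the layer sizes, the 224×224 input resolution, and m = 256 are illustrative assumptions:

```python
import torch
from torch import nn

m = 256  # dimensionality of the encoding vector X (an assumption)

encoder = nn.Sequential(                       # X = Encoder(I)
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(128 * 56 * 56, m),
)
decoder = nn.Sequential(                       # Id = Decoder(X), mirror of the encoder
    nn.Linear(m, 128 * 56 * 56),
    nn.Unflatten(1, (128, 56, 56)),
    nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(64, 3, 2, stride=2), nn.Sigmoid(),
)

I = torch.rand(1, 3, 224, 224)                 # a dummy input picture
X = encoder(I)                                 # the m-dimensional encoding
Id = decoder(X)                                # reconstructed picture, same shape as I
```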
S3, performing a first round of iterative training on the encoding-decoding model with the violation pictures in the public atlas A and the annotation files in the public violation region set a as training samples, and then performing a second round of iterative training with the violation pictures in the real atlas B and the annotation files in the real violation region set b as training samples, to obtain a kitchen violation behavior detection model;
It should be emphasized that, in order to further ensure the privacy and security of the detection model data information, the detection model data information may also be stored in a node of a blockchain.
In one embodiment, before the first iterative training and the second iterative training in step S3, the method further includes:
inputting the training samples into the encoder and encoding them with the encoder to obtain the m-dimensional vectors used to calculate the cross-entropy loss function value.
After the first iterative training and the second iterative training in step S3, the method further includes:
decoding the m-dimensional vector with the decoder to obtain the picture Id.
This embodiment realizes the conversion between the input and the output of the encoding-decoding model.
Cross entropy can be used as a loss function in neural networks (machine learning): p denotes the distribution of the real labels, q the distribution of the labels predicted by the trained model, and the cross-entropy loss function measures the similarity between p and q. Its advantage as a loss function is that, when the sigmoid function is used, gradient descent avoids the learning-rate decay problem of the mean-squared-error loss function, because the learning rate is controlled by the output error. In feature engineering it can also be used to measure the similarity between two random variables.
In the training stage, a training picture and the corresponding annotation file are input into the model; the model extracts features through a multi-layer convolutional network, locates the position of the target by feature matching, and then obtains the violation type through a classification network. The training process is data-driven: the final model parameters are obtained by minimizing the cross-entropy loss function, without manual parameter tuning. With more than 50 layers, the model can extract deep information from the image, so its generalization during detection is stronger, and its parameters can be continuously optimized as training samples accumulate, thereby improving detection accuracy.
In one embodiment, the first iterative training in step S3 includes the following steps:
performing feature-matching training on the encoding-decoding model to obtain a detection model with accurate feature matching;
and performing violation classification training on the detection model with accurate feature matching.
The second iterative training in step S3 includes the same steps; the first and second rounds differ mainly in the training samples used, the former using public data and the latter actual-scene data.
Feature-matching training adopts an iterative intersection-over-union raising method to improve the accuracy of the predicted box position.
The intersection-over-union ratio (IOU) measures the degree of coincidence between the predicted box and the target object, and is calculated as:
$$\mathrm{IOU} = \frac{S_1}{S_2}$$
where S1 is the area of the intersection of the predicted box boundary and the actual boundary, and S2 is the entire area spanned by the predicted box and the actual boundary together (their union).
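A sketch of this computation for two axis-aligned boxes given as (xmin, ymin, xmax, ymax), using the union area as S2 (the common IOU definition):

```python
def iou(pred, gt):
    """Intersection-over-union of a predicted box and a ground-truth box."""
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    s1 = max(0, ix2 - ix1) * max(0, iy2 - iy1)          # intersection area S1
    s2 = ((pred[2] - pred[0]) * (pred[3] - pred[1])
          + (gt[2] - gt[0]) * (gt[3] - gt[1]) - s1)     # union area S2
    return s1 / s2 if s2 > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 = 0.142857...
```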
The current method for judging a violation deems the prediction correct when the intersection-over-union ratio exceeds a set threshold; commonly, a detection result with an IOU greater than 0.5 is taken as correct. This step instead adopts a method that iteratively raises the intersection-over-union ratio to improve prediction accuracy: a threshold set R = {R1, R2, …, Rn} is preset, where n is the number of thresholds, R1 < R2 < … < Rn, and both the threshold values and their number n are determined by the effect required in experiments. In one embodiment the threshold set R = {0.3, 0.4, 0.5, 0.6}, and the specific feature-matching training steps are as follows:
when the intersection-over-union ratio IOU1 is smaller than the first threshold 0.3 in the preset threshold set R, the prediction is judged wrong and the training ends;
when IOU1 is greater than the first threshold 0.3, IOU1 is compared with the second threshold 0.4 in the preset threshold set R;
when IOU1 is smaller than the second threshold 0.4, the prediction is judged wrong;
and when IOU1 is greater than the second threshold 0.4, the results originally predicted in the interval [R1, R2] are taken as negative samples, the positive and negative samples are balanced for retraining, the intersection-over-union ratio is recalculated to obtain IOU2 and compared with the next threshold 0.5 in the preset threshold set R, iteratively raising the intersection-over-union ratio; the feature-matching training ends when a prediction is judged wrong or all thresholds in the preset threshold set R have been compared.
In this embodiment, prediction boxes with progressively higher intersection-over-union are obtained, so that the prediction box matches the position region of the corresponding feature more accurately.
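A control-flow sketch of this threshold cascade; compute_iou and retrain_with_rebalanced_negatives are stand-ins (assumptions) for the matching and retraining steps above, stubbed here so the flow is executable:

```python
def compute_iou(model):
    return model["iou"]                         # stub: the model's current IOU

def retrain_with_rebalanced_negatives(model, r_lo, r_hi):
    # Stub: results in [r_lo, r_hi] become negatives, samples are balanced,
    # and the model is retrained; assume each round raises the IOU.
    model["iou"] += 0.1
    return model

def iterative_iou_training(model, thresholds=(0.3, 0.4, 0.5, 0.6)):
    iou_value = compute_iou(model)              # IOU1
    if iou_value < thresholds[0]:               # below R1: prediction wrong
        return model                            # training ends
    for k in range(1, len(thresholds)):
        if iou_value < thresholds[k]:           # below the next threshold: wrong
            return model                        # training ends
        # IOU cleared thresholds[k]: take the [R(k-1), Rk] results as negatives,
        # rebalance and retrain, then recompute the IOU for the next threshold.
        model = retrain_with_rebalanced_negatives(model, thresholds[k - 1], thresholds[k])
        iou_value = compute_iou(model)          # IOU2, IOU3, ...
    return model                                # all thresholds compared

print(iterative_iou_training({"iou": 0.45})["iou"])  # 0.45 -> 0.55 -> 0.65 -> 0.75
```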
In violation classification training, the accuracy of the model's classification is measured with cross entropy and its positioning error with the L1 norm; each result output by the model is compared with the manually annotated result to obtain an error, the parameters are corrected according to the error, and after multiple iterations, once the error is smaller than the set threshold, model training is complete and the final model parameters are saved.
Taking the violation pictures and annotation files as samples, the encoding-decoding model is trained with stochastic gradient descent until the cross-entropy loss function between its input data and output data converges to a first threshold, preferably 0.001. The cross-entropy loss function is specifically:
$$L = -\sum_{i} w_i \left[\, p_i \log q_i + (1 - p_i)\log(1 - q_i) \,\right]$$
where $w_i$ is a weight that depends on the pixel point $i$: if pixel point $i$ lies in the violation region of the corresponding picture, then $w_i = \alpha$ with $0.6 \le \alpha \le 1$; otherwise $w_i = 1 - \alpha$. $p_i$ is the pixel value of pixel point $i$ in the violation picture, and $q_i$ is the pixel value of pixel point $i$ in the output of the encoding-decoding model when that violation picture is the input.
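A minimal sketch of this weighted per-pixel cross-entropy, assuming pixel values normalized to [0, 1] and a binary mask that is 1 inside the annotated violation region; alpha corresponds to the weight above, with 0.6 ≤ alpha ≤ 1:

```python
import torch

def weighted_cross_entropy(p, q, mask, alpha=0.8, eps=1e-7):
    """p: pixels of the violation picture, q: pixels of the model output,
    mask: 1.0 inside the violation region, 0.0 elsewhere."""
    w = alpha * mask + (1.0 - alpha) * (1.0 - mask)   # per-pixel weight w_i
    q = q.clamp(eps, 1.0 - eps)                       # avoid log(0)
    ce = -(p * torch.log(q) + (1.0 - p) * torch.log(1.0 - q))
    return (w * ce).sum()

p = torch.rand(1, 3, 224, 224)                        # input picture pixels
q = torch.rand_like(p)                                # model output pixels
mask = torch.zeros_like(p)
mask[..., 60:120, 80:160] = 1.0                       # annotated violation region
print(weighted_cross_entropy(p, q, mask))
```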
Training the violation classification capability of the detection model makes the model's judgment of an image's violation type more accurate.
S4, acquiring an image shot by the kitchen camera in real time and inputting it into the kitchen violation behavior detection model for violation detection;
In the use stage, once the kitchen violation behavior detection model is loaded into the network framework and a picture shot by the kitchen camera is input, the model performs feature extraction, localization, and classification on the picture, and the violation behavior is detected.
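An illustrative use-stage sketch; the checkpoint file name, the camera index, and the (boxes, labels) output convention are assumptions, not details from the application:

```python
import cv2
import torch

# Load the trained kitchen violation behavior detection model.
model = torch.load("kitchen_violation_detector.pt", map_location="cpu", weights_only=False)
model.eval()

cap = cv2.VideoCapture(0)                      # the kitchen camera stream
ok, frame = cap.read()                         # one real-time frame (H, W, 3), BGR
if ok:
    x = torch.from_numpy(frame).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        boxes, labels = model(x)               # located violation regions + types
    print(labels)                              # e.g. "wearing a hat without rules"
cap.release()
```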
According to this embodiment, the behavior and actions of workers in the kitchen scene can be captured conveniently and accurately, and violations judged, without manual supervision.
Fig. 3 is a structural diagram of a kitchen violation behavior detection device in an embodiment of the present application; as shown in fig. 3, the device includes the following modules:
the sample library construction module 10, used for acquiring kitchen violation pictures disclosed on the network to construct a public atlas A, acquiring kitchen violation pictures captured by cameras in actual kitchen scenes to construct a real atlas B, constructing a public violation region set a using the violation pictures in the public atlas A, and constructing a real violation region set b using the violation pictures in the real atlas B;
the model initialization module 20, configured to construct an encoding-decoding detection model based on a convolutional neural network structure;
the model training module 30, configured to perform a first round of iterative training on the encoding-decoding model with the violation pictures in the public atlas A and the annotation files in the public violation region set a as training samples, and a second round of iterative training with the violation pictures in the real atlas B and the annotation files in the real violation region set b as training samples, to obtain the kitchen violation behavior detection model;
and the violation detection module 40, used for acquiring images shot by the kitchen camera in real time and inputting them into the kitchen violation behavior detection model for violation detection.
Wherein the memory has stored therein computer-readable instructions that, when executed by the processor, cause the processor to perform the steps of the kitchen violation behavior detection method described above.
In one embodiment, a storage medium storing computer-readable instructions is provided. The computer-usable storage medium may mainly include a program storage area and a data storage area: the program storage area may store an operating system and the application program required for at least one function, while the data storage area may store data created according to the use of the blockchain node, and the like.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, which may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.
The technical features of the embodiments described above can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-described embodiments are merely illustrative of some embodiments of the present application, which are described in more detail and detail, but are not to be construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent should be subject to the appended claims.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks associated with one another by cryptographic methods, each data block containing the information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and so on.

Claims (10)

1. A kitchen violation behavior detection method, comprising:
acquiring kitchen violation pictures disclosed on the network to construct a public atlas A, acquiring kitchen violation pictures captured by cameras in actual kitchen scenes to construct a real atlas B, constructing a public violation region set a using the violation pictures in the public atlas A, and constructing a real violation region set b using the violation pictures in the real atlas B;
constructing an encoding-decoding detection model based on a convolutional neural network structure;
performing a first round of iterative training on the encoding-decoding model with the violation pictures in the public atlas A and the annotation files in the public violation region set a as training samples, and then performing a second round of iterative training with the violation pictures in the real atlas B and the annotation files in the real violation region set b as training samples, to obtain a kitchen violation behavior detection model;
and acquiring images shot by the kitchen camera in real time and inputting them into the kitchen violation behavior detection model for violation detection.
2. The kitchen violation behavior detection method according to claim 1, wherein after constructing the public atlas A, the method further comprises:
balancing the sample distribution in the public atlas A to obtain a public atlas A with uniformly distributed samples;
and performing image enhancement on the samples in the uniformly distributed public atlas A to obtain a public atlas A with diverse sample data.
3. The kitchen violation behavior detection method according to claim 1, wherein constructing the public violation region set a using the violation pictures in the public atlas A comprises:
marking the violation regions of the kitchen violation behavior pictures in the public atlas A with an annotation tool, generating the annotation files, and constructing the public violation region set a with the annotation files as sample data.
4. The kitchen violation behavior detection method according to claim 1, wherein the first round of iterative training on the encoding-decoding model comprises:
performing feature-matching training on the encoding-decoding model to obtain a detection model with accurate feature matching;
and performing violation classification training on the detection model with accurate feature matching.
5. The kitchen violation behavior detection method according to claim 4, wherein the feature-matching training on the encoding-decoding model comprises:
when the intersection-over-union ratio IOU1 is smaller than a first threshold R1 in a preset threshold set R, judging the prediction to be wrong and ending the training;
when the intersection-over-union ratio IOU1 is greater than the first threshold R1, comparing IOU1 with a second threshold R2 in the preset threshold set R;
when the intersection-over-union ratio IOU1 is smaller than the second threshold R2, judging the prediction to be wrong;
and when the intersection-over-union ratio IOU1 is greater than the second threshold R2, taking the results originally predicted in the interval [R1, R2] as negative samples, balancing the positive and negative samples for retraining, recalculating the intersection-over-union ratio to obtain IOU2, and comparing it with the next threshold in the preset threshold set R, iteratively raising the intersection-over-union ratio; the feature-matching training ends when a prediction is judged wrong or all thresholds in the preset threshold set R have been compared.
6. The kitchen violation behavior detection method according to claim 4, wherein performing violation classification training on the detection model with accurate feature matching comprises:
calculating the cross-entropy loss function value between the input layer and the output layer of the encoding-decoding model;
and when the cross-entropy loss function value is greater than a preset threshold K, adjusting the classification network parameters of the encoding-decoding model according to the cross-entropy loss function value to obtain a new output result, recalculating the cross-entropy loss function value between the input layer and the output layer of the encoding-decoding model, and repeating over multiple iterations until the cross-entropy loss function value is smaller than the preset threshold K, at which point the training ends.
7. The kitchen violation behavior detection method according to claim 6, wherein before the first round of iterative training on the encoding-decoding model, the method further comprises:
inputting the training samples into the encoder and encoding them with the encoder to obtain the m-dimensional vectors used to calculate the cross-entropy loss function value.
8. A kitchen violation behavior detection device, characterized by comprising the following modules:
a sample library construction module: for acquiring kitchen violation pictures disclosed on the network to construct a public atlas A, acquiring kitchen violation pictures captured by cameras in actual kitchen scenes to construct a real atlas B, constructing a public violation region set a using the violation pictures in the public atlas A, and constructing a real violation region set b using the violation pictures in the real atlas B;
a model initialization module: for constructing an encoding-decoding detection model based on a convolutional neural network structure;
a model training module: for performing a first round of iterative training on the encoding-decoding model with the violation pictures in the public atlas A and the annotation files in the public violation region set a as training samples, and then performing a second round of iterative training with the violation pictures in the real atlas B and the annotation files in the real violation region set b as training samples, to obtain a kitchen violation behavior detection model;
and a violation detection module: for acquiring images shot by the kitchen camera in real time and inputting them into the kitchen violation behavior detection model for violation detection.
9. A kitchen violation behavior detection device comprising a memory and a processor, the memory having stored therein computer-readable instructions that, when executed by the processor, cause the processor to perform the kitchen violation behavior detection method of any of claims 1-7.
10. A storage medium having stored thereon computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the kitchen violation behavior detection method of any of claims 1-7.
CN202010617901.0A 2020-07-01 2020-07-01 Detection method, device, equipment and storage medium for kitchen violation behaviors Pending CN111507320A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010617901.0A CN111507320A (en) 2020-07-01 2020-07-01 Detection method, device, equipment and storage medium for kitchen violation behaviors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010617901.0A CN111507320A (en) 2020-07-01 2020-07-01 Detection method, device, equipment and storage medium for kitchen violation behaviors

Publications (1)

Publication Number Publication Date
CN111507320A true CN111507320A (en) 2020-08-07

Family

ID=71875340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010617901.0A Pending CN111507320A (en) 2020-07-01 2020-07-01 Detection method, device, equipment and storage medium for kitchen violation behaviors

Country Status (1)

Country Link
CN (1) CN111507320A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183265A (en) * 2020-09-17 2021-01-05 国家电网有限公司 Electric power construction video monitoring and alarming method and system based on image recognition
CN112507912A (en) * 2020-12-15 2021-03-16 网易(杭州)网络有限公司 Method and device for identifying illegal picture
CN112862519A (en) * 2021-01-20 2021-05-28 北京奥维云网大数据科技股份有限公司 Sales anomaly identification method for retail data of electric business platform household appliances
CN113177519A (en) * 2021-05-25 2021-07-27 福建帝视信息科技有限公司 Density estimation-based method for evaluating messy differences of kitchen utensils
CN117278696A (en) * 2023-11-17 2023-12-22 西南交通大学 Method for editing illegal video of real-time personal protective equipment on construction site

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180082172A1 (en) * 2015-03-12 2018-03-22 William Marsh Rice University Automated Compilation of Probabilistic Task Description into Executable Neural Network Specification
US20190188376A1 (en) * 2017-12-19 2019-06-20 Western Digital Technologies, Inc. Apparatus and method of detecting potential security violations of direct access non-volatile memory device
CN110163143A (en) * 2019-05-17 2019-08-23 国网河北省电力有限公司沧州供电分公司 Unlawful practice recognition methods, device and terminal device
CN110378311A (en) * 2019-07-25 2019-10-25 杭州视在科技有限公司 Violation judgment method in kitchen after food and drink based on Encoder-Decoder model and mixed Gauss model
CN110659597A (en) * 2019-09-11 2020-01-07 安徽超清科技股份有限公司 Bright kitchen range management system based on big data
WO2020064093A1 (en) * 2018-09-25 2020-04-02 Nokia Technologies Oy End-to-end learning in communication systems
CN111178458A (en) * 2020-04-10 2020-05-19 支付宝(杭州)信息技术有限公司 Training of classification model, object classification method and device
CN111291190A (en) * 2020-03-23 2020-06-16 腾讯科技(深圳)有限公司 Training method of encoder, information detection method and related device

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180082172A1 (en) * 2015-03-12 2018-03-22 William Marsh Rice University Automated Compilation of Probabilistic Task Description into Executable Neural Network Specification
US20190188376A1 (en) * 2017-12-19 2019-06-20 Western Digital Technologies, Inc. Apparatus and method of detecting potential security violations of direct access non-volatile memory device
WO2020064093A1 (en) * 2018-09-25 2020-04-02 Nokia Technologies Oy End-to-end learning in communication systems
CN110163143A (en) * 2019-05-17 2019-08-23 国网河北省电力有限公司沧州供电分公司 Unlawful practice recognition methods, device and terminal device
CN110378311A (en) * 2019-07-25 2019-10-25 杭州视在科技有限公司 Violation judgment method in kitchen after food and drink based on Encoder-Decoder model and mixed Gauss model
CN110659597A (en) * 2019-09-11 2020-01-07 安徽超清科技股份有限公司 Bright kitchen range management system based on big data
CN111291190A (en) * 2020-03-23 2020-06-16 腾讯科技(深圳)有限公司 Training method of encoder, information detection method and related device
CN111178458A (en) * 2020-04-10 2020-05-19 支付宝(杭州)信息技术有限公司 Training of classification model, object classification method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
魏泽发 (WEI Zefa): "Deep-learning-based detection of taxi driver violation behaviors", China Masters' Theses Full-text Database, Engineering Science and Technology II *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183265A (en) * 2020-09-17 2021-01-05 国家电网有限公司 Electric power construction video monitoring and alarming method and system based on image recognition
CN112507912A (en) * 2020-12-15 2021-03-16 网易(杭州)网络有限公司 Method and device for identifying illegal picture
CN112507912B (en) * 2020-12-15 2024-06-11 杭州网易智企科技有限公司 Method and device for identifying illegal pictures
CN112862519A (en) * 2021-01-20 2021-05-28 北京奥维云网大数据科技股份有限公司 Sales anomaly identification method for retail data of electric business platform household appliances
CN113177519A (en) * 2021-05-25 2021-07-27 福建帝视信息科技有限公司 Density estimation-based method for evaluating messy differences of kitchen utensils
CN113177519B (en) * 2021-05-25 2021-12-14 福建帝视信息科技有限公司 Density estimation-based method for evaluating messy differences of kitchen utensils
CN117278696A (en) * 2023-11-17 2023-12-22 西南交通大学 Method for editing illegal video of real-time personal protective equipment on construction site
CN117278696B (en) * 2023-11-17 2024-01-26 西南交通大学 Method for editing illegal video of real-time personal protective equipment on construction site

Similar Documents

Publication Publication Date Title
CN110909651B (en) Method, device and equipment for identifying video main body characters and readable storage medium
CN111507320A (en) Detection method, device, equipment and storage medium for kitchen violation behaviors
CN111325115B (en) Cross-modal countervailing pedestrian re-identification method and system with triple constraint loss
CN103824055B (en) A kind of face identification method based on cascade neural network
CN109145745B (en) Face recognition method under shielding condition
US11055538B2 (en) Object re-identification with temporal context
CN110851835A (en) Image model detection method and device, electronic equipment and storage medium
CN111160313B (en) Face representation attack detection method based on LBP-VAE anomaly detection model
CN105335726B (en) Recognition of face confidence level acquisition methods and system
CN108564040B (en) Fingerprint activity detection method based on deep convolution characteristics
CN104636730A (en) Method and device for face verification
CN113449704B (en) Face recognition model training method and device, electronic equipment and storage medium
CN108108760A (en) A kind of fast human face recognition
CN113205002B (en) Low-definition face recognition method, device, equipment and medium for unlimited video monitoring
CN115050064A (en) Face living body detection method, device, equipment and medium
Li et al. Face liveness detection and recognition using shearlet based feature descriptors
Wu et al. Comparative analysis and application of LBP face image recognition algorithms
Naveen et al. Face recognition and authentication using LBP and BSIF mask detection and elimination
CN109101984B (en) Image identification method and device based on convolutional neural network
Jiang et al. Reconstructing recognizable 3d face shapes based on 3d morphable models
Oloyede et al. Evaluating the effect of occlusion in face recognition systems
CN116310474A (en) End-to-end relationship identification method, model training method, device, equipment and medium
CN115909421A (en) Intelligent door lock face recognition encryption method and system
KR102540290B1 (en) Apparatus and Method for Person Re-Identification based on Heterogeneous Sensor Camera
CN114898137A (en) Face recognition-oriented black box sample attack resisting method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200807